arXiv:1503.03753v1 [cs.IR] 11 Mar 2015


From Group Recommendations to Group Formation 


Senjuti Basu Roy†, Laks V. S. Lakshmanan*, Rui Liu†

† University of Washington Tacoma, * University of British Columbia

{senjutib,ruiliu}@uw.edu,laks@cs.ubc.ca 


ABSTRACT 

There has been significant recent interest in the area of group recommendations, where, given groups of users of a recommender system, one wants to recommend top-k items to a group that maximize the satisfaction of the group members, according to a chosen semantics of group satisfaction. Example semantics of satisfaction of a recommended itemset to a group include the so-called least misery (LM) and aggregate voting (AV). We consider the complementary problem of how to form groups such that the users in the formed groups are most satisfied with the suggested top-k recommendations. We assume that the recommendations will be generated according to one of the two group recommendation semantics, LM or AV. Rather than assuming groups are given, or relying on ad hoc group formation dynamics, our framework allows a strategic approach for forming groups of users in order to maximize satisfaction. We show that the problem is NP-hard to solve optimally under both semantics. Furthermore, we develop two efficient algorithms for group formation under LM and show that they achieve bounded absolute error. We develop efficient heuristic algorithms for group formation under AV. We validate our results and demonstrate the scalability and effectiveness of our group formation algorithms on two large real data sets.


1. INTRODUCTION 

There is a proliferation of group recommender systems that cope with the challenge of addressing recommendations for groups of users. YuTV is a TV program recommender for groups of viewers, LET'S BROWSE recommends web pages to a group of two or more people who are browsing the web together, and FlyTrap recommends music to be played in a public room. What all these recommender systems have in common is that they assume that the groups are ad hoc, are formed organically and are provided as inputs, and focus on designing the most appropriate recommendation semantics and effective algorithms. In all these systems, all members of a group are recommended a common list of items to consume. Indeed, designing semantics to recommend items to ad-hoc groups has been a subject of recent research, and several algorithms have been developed for recommending items personalized to given groups of users.

We, on the other hand, study the flip problem and ask: if group recommender systems follow one of these existing popular semantics, how best can we form groups to maximize user satisfaction? We pose this as an optimization problem to form groups in a principled manner, such that, after the groups are formed and the users inside each group are recommended an itemset to consume together, following the existing semantics and group recommendation algorithms, they are as satisfied as possible.


Applications: Our strategic group formation is potentially of interest to all group recommender system applications, as long as they use certain recommendation semantics. Instead of ad-hoc group formation [8], or grouping individuals based on similarity in preferences [22] or meta-data (e.g., socio-demographic factors), we explicitly embed the underlying group recommendation semantics in the group formation phase, which may dramatically improve user satisfaction. For example, two users with similar socio-demographic attributes may still have very distinct preferences for watching TV or listening to music; a meta-data based group formation strategy may nevertheless place them in the same group, and a group recommendation semantics may then end up recommending items that are not satisfactory to both. We attempt to bridge that gap and propose a systematic investigation of the group formation problem. More concretely, our focus in this work is to formalize how to form a set of user groups for a given population, such that the aggregate satisfaction of all groups w.r.t. their recommended top-k item lists, generated according to existing popular group recommendation semantics using existing group recommendation algorithms, is maximized.

Our work is also orthogonal to existing market based strategies on daily deals sites, such as Groupon and LivingSocial. Most of these works focus on recommending deals (i.e., items or bundles of items) to users. They rely on incentivizing formation of groups via price discounting. The utility of such recommendation strategies is to maximize revenue, whereas our group formation problem is purely designed to maximize user satisfaction. We elaborate on this interesting but orthogonal research direction further in the related work section.

Travel planning for user groups is a popular group recommendation application, where several hundreds of travelers can register their individual preferences to visit certain points of interest (POIs) in a city. A travel agency may decide to support, say, 25 different user groups. Given these groups, they accordingly design 25 different plans, where each plan consists of a list of 5-10 different POIs tailored to each group. Consequently, the registered customers are to be partitioned to form 25 different groups and each group will be recommended a plan of $k$ items ($5 \le k \le 10$), based on a "standard" group recommendation semantics.

Other emerging applications, such as recommending news, music, books, restaurants, and TV programs to groups, make use of similar settings. The size of the user population, the number of groups, the length of the recommended item list, or the most appropriate group recommendation semantics may be application dependent and are best decided by the domain experts; for example, an online news agency may create hundreds of segments of their large reader-base (with several thousands of users) to serve the top-10 news, whereas a TV recommender system may only form a few groups to serve the most appropriate 3 programs to a family. What ties all these applications together is the applicability of the same underlying settings.

For all these scenarios, we study, if group recommender systems follow a given semantics, how best we can partition the user-base to form a pre-defined number of groups, such that the recommended top-k items to the groups maximize user satisfaction. We exploit existing popular group recommendation semantics and do not propose a new one. Our problem is therefore a non-intrusive addition to existing operational recommender systems and has clear practical impact in all these applications.

Contributions: Our first contribution is in proposing a formalism to create groups in the context of an existing group recommender system, that naturally fits many emerging applications. In particular, we study the group formation problem under two popular group recommendation semantics, namely least misery (LM) and aggregate voting (AV). Given an item and a group, LM sets the preference rating of the item for the group to be the preference rating of the least happy member of that group for that item. On the other hand, AV sets the preference rating of the item for the group to be the aggregated (sum) preference rating of that item over the members of the group. In this paper, given a user population of a recommender system, a set of items, and a number $\ell$, we seek to partition the users into at most $\ell$ non-overlapping groups, such that when each group is recommended a top-k list of items under a group recommendation semantics (LM or AV), the aggregate (sum) satisfaction of the created groups is maximized. Given a group, its satisfaction with a recommended top-k item list could be measured in many ways: by considering the group preference of the most preferred item, the least preferred item (i.e., the k-th recommended item), or the sum of group preferences over all $k$ items. These alternatives are discussed in Section 2.

Our second contribution is computational. We provide an in-depth analysis of the group formation problem and prove that finding an optimal set of groups is NP-hard under both group recommendation semantics, LM and AV. We propose two efficient approximation algorithms for group formation under LM with provable theoretical guarantees. In particular, our proposed algorithms GRD-LM have absolute error guarantees. Additionally, we describe efficient heuristic algorithms GRD-AV for forming groups under AV semantics and analyze the complexity of both algorithms. We also propose an integer programming based optimal solution for both LM and AV semantics (see Section A in the appendix), which will not scale, but can be used on small data sets as a reference against which to calibrate the scalable algorithms w.r.t. the quality of the solution.

Finally, we conduct a detailed empirical evaluation by using two large scale real world data sets (Yahoo! Music and MovieLens) to demonstrate the effectiveness as well as the scalability of our proposed solutions. We also compare our proposed solutions with intuitive baseline algorithms both qualitatively and w.r.t. various performance metrics. Our experimental results indicate that our proposed algorithms successfully form groups with high satisfaction scores w.r.t. the top-k recommendations made to the groups. Additionally, we demonstrate that the proposed solutions are highly scalable and terminate within a couple of minutes in most cases. Furthermore, we conduct a user study involving users from Amazon Mechanical Turk. Our results demonstrate that our proposed formalism is indeed effective for forming multiple groups, with the individuals being highly satisfied w.r.t. the suggested top-k group recommendations. Based on our experimental analysis, our group formation algorithms consistently outperform the baseline algorithms both qualitatively and on various performance metrics.


In summary, we make the following contributions: 

• We initiate the study of how to form groups in the context of recommender systems, considering popular group recommendation semantics. We formalize the task as an optimization problem, with the objective to form groups such that the aggregated group satisfaction w.r.t. the suggested group recommendation is maximized (Section 2).

• We provide an in-depth analysis of the problem and prove that finding an optimal set of groups is NP-hard under both LM and AV semantics (Section 3). We present several simple and efficient algorithms for group formation (Section 4 onward). We show that our algorithms for LM semantics, under both Min and Sum aggregation, achieve a bounded absolute error w.r.t. the optimal solutions (Section 4). We also work out a clean integer programming based formulation of the optimal solution under both LM and AV semantics (Appendix A).

• We conduct a comprehensive experimental study (Section 7) on Yahoo! Music and MovieLens data sets and show that our algorithms are effective in achieving high aggregate satisfaction scores for user groups compared to the optimum, lead to relatively balanced group sizes and high average group satisfaction, and scale very well w.r.t. number of users, items, items recommended, and number of groups allowed. Our user study results demonstrate the effectiveness of our proposed solutions.

We also present extensions of the proposed work and discuss related work. Finally, we conclude the paper and outline future research directions.

2. PRELIMINARIES & PROBLEM DEFINITION

In this section, we first discuss the preliminaries and describe our data model. We also present two running examples that are used throughout the paper. Finally, we formalize the group formation problem in Section 2.4.

2.1 Data Model 

We assume an item-set $\mathcal{I} = \{i_1, i_2, \ldots, i_m\}$ containing $m$ items and a user set $\mathcal{U} = \{u_1, u_2, \ldots, u_n\}$ with $n$ users. A group $g$ corresponds to a subset of users, i.e., $g \subseteq \mathcal{U}$. In this paper, we consider recommender systems with explicit feedback, which means users' feedback on items is in the form of an explicit rating $sc(u, i) \in \mathcal{R}$, where $\mathcal{R}$ is typically a discrete set of positive integers, e.g., $\mathcal{R} = \{1, \ldots, 5\}$, with $r_{min}$ and $r_{max}$ being the minimum and maximum possible ratings respectively (e.g., $r_{min}$ may be 0 and $r_{max}$ may be 5). Without causing confusion, we also use $sc(u, i)$ to denote the rating of item $i$ predicted for user $u$ by the recommender system.^1 Thus, in general, $sc(u, i)$ denotes user $u$'s preference for item $i$, whether user provided or system predicted. We sometimes also refer to $sc(u, i)$ as the relevance of an item for a user. The recommended top-k item list for a group $g$ is denoted $\mathcal{I}_g$, where $\mathcal{I}_g \subseteq \mathcal{I}$ and $|\mathcal{I}_g| = k$. Furthermore, we denote the k-th item score for group $g$ as $sc(g, i^k)$, where $i^k$ denotes the k-th item (i.e., the worst item) in the top-k item list $\mathcal{I}_g$ recommended to $g$. We note that $sc(g, i^k)$ is a quantity that is defined according to a chosen group satisfaction semantics such as LM or AV, as explained in the next subsection.

^1 Predicted ratings may be real numbers.



User-item Ratings   u1  u2  u3  u4  u5  u6
i1                   1   2   2   2   3   1
i2                   4   3   5   5   1   2
i3                   3   5   1   1   1   5

Table 1: User Item Preference Rating for Example 1

User-item Ratings   u1  u2  u3  u4  u5  u6
i1                   3   1   2   2   1   3
i2                   1   4   5   5   2   2
i3                   4   3   1   1   3   1

Table 2: User Item Preference Rating for Example 2

Example 1. Imagine that the user set $\mathcal{U} = \{u_1, u_2, u_3, u_4, u_5, u_6\}$ contains 6 members and the itemset $\mathcal{I} = \{i_1, i_2, i_3\}$ has 3 items. The users' preferences for the itemset are given (or predicted) as in Table 1. Imagine that the user set needs to be partitioned into at most 3 groups ($\ell \le 3$). □

Example 2. Imagine that the same user set with the same itemset now has different ratings, as presented in Table 2. Let us assume that the user set needs to be partitioned into at most 2 groups ($\ell \le 2$). □

2.2 Group Recommendation Semantics 

A group recommendation semantics spells out a numeric measure of just how satisfied a group is with an item recommended to it. Two popular semantics of group recommendation that have been employed in the literature on group recommendations are: (i) aggregate voting and (ii) least misery. Given a group $g$ and item $i$, the aggregated voting score of item $i$ for the group is the sum of the preference ratings of item $i$ for each member $u \in g$. On the other hand, the least misery score of item $i$ for group $g$ is the minimum preference rating of item $i$ across all members of $g$.

Definition 1 (Least Misery Semantics $\mathcal{F}_{LM}$). The group satisfaction score of item $i$ for group $g$ is the least score of $i$ across members of $g$, i.e., $sc(g, i) = \min_{u \in g} sc(u, i)$.

Definition 2 (Aggregated Voting Semantics $\mathcal{F}_{AV}$). The group satisfaction score of item $i$ for group $g$ is the aggregated score of $i$ for all members of $g$, i.e., $sc(g, i) = \sum_{u \in g} sc(u, i)$.
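The two definitions can be sketched in a few lines of Python. This is a minimal illustration only: the dictionary layout and the function names `lm_score` and `av_score` are assumptions made for the example, and the ratings are those of Table 1.

```python
# Ratings from Table 1 (Example 1): RATINGS[user][item] = sc(u, i).
RATINGS = {
    "u1": {"i1": 1, "i2": 4, "i3": 3},
    "u2": {"i1": 2, "i2": 3, "i3": 5},
    "u3": {"i1": 2, "i2": 5, "i3": 1},
    "u4": {"i1": 2, "i2": 5, "i3": 1},
    "u5": {"i1": 3, "i2": 1, "i3": 1},
    "u6": {"i1": 1, "i2": 2, "i3": 5},
}


def lm_score(ratings, group, item):
    """Least misery: the minimum rating of `item` over the group's members."""
    return min(ratings[u][item] for u in group)


def av_score(ratings, group, item):
    """Aggregated voting: the sum of ratings of `item` over the group's members."""
    return sum(ratings[u][item] for u in group)


if __name__ == "__main__":
    group = ["u1", "u3", "u4"]
    print(lm_score(RATINGS, group, "i2"))  # least happy member decides
    print(av_score(RATINGS, group, "i2"))  # all members' ratings add up
```

For the group $\{u_1, u_3, u_4\}$ and item $i_2$, the ratings are 4, 5 and 5, so LM yields 4 and AV yields 14.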

2.3 Group Satisfaction Aggregation 

Given a group $g$ and a list of $k$ recommended items $\mathcal{I}_g$, there are multiple ways of aggregating the scores of the $k$ items in order to define the group $g$'s satisfaction with the recommended list $\mathcal{I}_g$. Some of the natural alternatives are described below.

• Max-aggregation: Satisfaction of the group is measured as the score of the very top item in the list, i.e., $gs(\mathcal{I}_g) := sc(g, i^1)$.

• Min-aggregation: Satisfaction of the group is measured as the score of the k-th item in the recommended list, i.e., $gs(\mathcal{I}_g) := sc(g, i^k)$.

• Sum-aggregation: Satisfaction of the group is measured as the sum of scores of all items in the list, i.e., $gs(\mathcal{I}_g) := \sum_{j=1}^{k} sc(g, i^j)$. It is also possible to design a Weighted Sum aggregation function, where each of the $k$ items is assigned a differential weight correlated with its position in the list. We briefly discuss this extension later in the paper.

Notice that when $k = 1$, Max, Min, and Sum-aggregation coincide.
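The three aggregation functions can be sketched as follows. The helper names and the tie-breaking rule (ties between equally scored items are broken by item id) are assumptions of this sketch, not something the text prescribes; the ratings are those of Table 1.

```python
RATINGS = {
    "u1": {"i1": 1, "i2": 4, "i3": 3},
    "u2": {"i1": 2, "i2": 3, "i3": 5},
    "u3": {"i1": 2, "i2": 5, "i3": 1},
    "u4": {"i1": 2, "i2": 5, "i3": 1},
    "u5": {"i1": 3, "i2": 1, "i3": 1},
    "u6": {"i1": 1, "i2": 2, "i3": 5},
}


def top_k_list(ratings, group, k, semantics="LM"):
    """Return the group's top-k (item, score) pairs under LM or AV."""
    combine = min if semantics == "LM" else sum
    items = next(iter(ratings.values())).keys()
    scores = {i: combine(ratings[u][i] for u in group) for i in items}
    # Highest score first; ties broken by item id (an assumption).
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:k]


def group_satisfaction(ratings, group, k, semantics="LM", aggregation="min"):
    """Aggregate the top-k scores with Max, Min or Sum aggregation."""
    top = [score for _, score in top_k_list(ratings, group, k, semantics)]
    if aggregation == "max":
        return top[0]    # score of the very top item
    if aggregation == "min":
        return top[-1]   # score of the k-th (bottom) item
    if aggregation == "sum":
        return sum(top)  # sum over all k items
    raise ValueError("aggregation must be 'max', 'min' or 'sum'")


if __name__ == "__main__":
    for agg in ("max", "min", "sum"):
        print(agg, group_satisfaction(RATINGS, ["u3", "u4"], 2, "LM", agg))
```

For the group $\{u_3, u_4\}$ with $k = 2$ under LM, the top-2 list is $(i_2, i_1)$ with scores 5 and 2, so Max, Min and Sum aggregation give 5, 2 and 7 respectively; with $k = 1$ all three give 5.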

2.4 Problem Definition 

Recommendation Aware Group Formation (GF): Given items $\{i_1, i_2, \ldots, i_m\}$ and users $\{u_1, u_2, \ldots, u_n\}$, a group recommendation semantics LM or AV, and two integers $k$ and $\ell$, create a set of at most $\ell$ non-overlapping groups, where each group $g$ is associated with a top-k itemset $\mathcal{I}_g$ in accordance with semantics LM or AV, s.t.:

• The aggregated group satisfaction of the created groups is maximized; i.e., maximize $Obj = \sum_{j=1}^{\ell} gs(\mathcal{I}_{g_j})$, where $gs(\mathcal{I}_{g_j})$ is the satisfaction of group $g_j$ with its recommended list.

3. COMPLEXITY ANALYSIS 

In this section, we show that the recommendation-aware group formation (GF) problem is NP-hard. Our hardness reduction is from Exact Cover by 3-Sets (X3C), known to be NP-hard. Since a direct reduction is involved, we first prove a helper lemma, which shows that a restricted version of Boolean Expected Component Sum (ECS),^2 called Perfect ECS (PECS for short), is NP-hard. We then reduce PECS to GF.

An instance of PECS consists of a collection $V$ of $m$-dimensional boolean vectors, i.e., $V \subseteq \{0,1\}^m$, and a number $K$. The question is whether there exists a disjoint partition of $V$ into $K$ subsets $V_1, \ldots, V_K$ such that $\sum_{i=1}^{K} \max_{1 \le j \le m} \left( \sum_{v \in V_i} v[j] \right) = |V|$. We have:

Lemma 1. PECS is NP-complete. 

Proof. The membership of PECS in NP is straightforward.

To prove hardness, we reduce a known NP-complete problem, namely Exact Cover by 3-Sets (X3C), to PECS.

X3C: An instance $\mathcal{I}$ of X3C consists of a ground set $\mathcal{X}$ with $3q$ elements and a collection $\mathcal{C} = \{S_1, \ldots, S_m\}$ of 3-element subsets of $\mathcal{X}$. The problem is to find if there exists a subset $\mathcal{C}' \subseteq \mathcal{C}$ such that $\mathcal{C}'$ is an exact cover of $\mathcal{X}$, i.e., each element of $\mathcal{X}$ occurs in exactly one subset in $\mathcal{C}'$.

Given an instance $\mathcal{I}$ of X3C, we create an instance $\mathcal{J}$ of PECS as follows. Transform each element $x_i \in \mathcal{X}$ into a boolean vector $v_i \in \{0,1\}^m$ by setting $v_i[j] = 1$ if $x_i \in S_j$ and $v_i[j] = 0$ otherwise. Thus, $V = \{v_1, \ldots, v_{3q}\}$, where vector $v_i$ corresponds to ground element $x_i \in \mathcal{X}$. By construction, subset $S_j \in \mathcal{C}$ corresponds to dimension $j$ of the vectors. Notice that at most three vectors have a 1 in any given dimension. Thus, instance $\mathcal{J}$ consists of the vectors $V$ and the number $K = q$.

We claim that $\mathcal{I}$ is a YES-instance of X3C iff $\mathcal{J}$ is a YES-instance of PECS.

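($\Rightarrow$): Suppose $\mathcal{C}' = \{S_{j_1}, \ldots, S_{j_q}\} \subseteq \mathcal{C}$ is an exact (disjoint) cover of $\mathcal{X}$. Then consider the partition $\{V_1, \ldots, V_q\}$ of $V$, where $v_i \in V_k$ iff $x_i \in S_{j_k}$. Notice that each $V_k$ consists of exactly three vectors. Since $\mathcal{C}'$ is an exact cover of $\mathcal{X}$, each element $x \in \mathcal{X}$ appears in exactly one subset $S_{j_k} \in \mathcal{C}'$. Thus, $\max_{1 \le j \le m} \left( \sum_{v \in V_k} v[j] \right) = 3$ for every $k$, and hence $\sum_{k=1}^{q} \max_{1 \le j \le m} \left( \sum_{v \in V_k} v[j] \right) = 3q$, showing $\mathcal{J}$ is a YES-instance.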

($\Leftarrow$): Let $\pi = \{V_1, \ldots, V_q\}$ be a partition of $V$ such that $\sum_{i=1}^{q} \max_{1 \le j \le m} \left( \sum_{v \in V_i} v[j] \right) = 3q$, witnessing the fact that $\mathcal{J}$ is a YES-instance. Observe that any block $T \in \pi$ with more than three vectors in it cannot contribute more than 3 to the sum above, since at most three vectors have a 1 in any given dimension. As well, any block $T \in \pi$ with fewer than three vectors will surely contribute less than 3 to the sum above. Since the overall sum is $3q$, it follows that every block must have exactly three vectors in it. For a block $T \in \pi$, let $dim(T) = \arg\max \{ \sum_{v \in T} v[j] \mid 1 \le j \le m \}$, i.e., $dim(T)$ is the dimension which maximizes the component sum. Then consider the collection $\mathcal{C}' = \{ S_{dim(T)} \in \mathcal{C} \mid T \in \pi \}$. It is easy to verify that $|\mathcal{C}'| = q$ and that every element $x \in \mathcal{X}$ appears in exactly one set $S \in \mathcal{C}'$. □

^2 Expected Component Sum is also NP-hard.
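The construction used in the proof of Lemma 1 can be sketched as below. The function names `x3c_to_pecs` and `pecs_value` and the small X3C instance are assumptions made for illustration; the code only builds the boolean vectors and evaluates the PECS sum for a given partition, it does not solve either problem.

```python
def x3c_to_pecs(ground, subsets):
    """Map each ground element x to a boolean vector with v[j] = 1 iff x is in S_j.

    Returns the vectors (keyed by ground element) and K = q = |ground| / 3.
    """
    vectors = {x: tuple(1 if x in s else 0 for s in subsets) for x in ground}
    return vectors, len(ground) // 3


def pecs_value(vectors, partition):
    """Sum, over blocks, of the largest per-dimension component sum."""
    m = len(next(iter(vectors.values())))
    return sum(
        max(sum(vectors[x][j] for x in block) for j in range(m))
        for block in partition
    )


if __name__ == "__main__":
    # Hypothetical X3C instance with 3q = 6 elements; {S_1, S_2} is an exact cover.
    ground = [1, 2, 3, 4, 5, 6]
    subsets = [{1, 2, 3}, {4, 5, 6}, {1, 4, 5}]
    vectors, K = x3c_to_pecs(ground, subsets)
    print(K, pecs_value(vectors, [[1, 2, 3], [4, 5, 6]]))  # reaches |V| = 6
    print(pecs_value(vectors, [[1, 2, 4], [3, 5, 6]]))     # falls short of |V|
```

The partition induced by the exact cover reaches the value $|V| = 6$, while the second partition, which does not correspond to a cover, only reaches 4.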

Theorem 1. The Group Formation Problem is NP-hard under 
both the least misery and aggregated voting semantics. 

Proof. We prove hardness of the restricted version of GF, where $k = 1$ item is to be recommended to each group such that the sum of satisfaction measures of each group is maximized, from which the theorem follows. We prove this hardness by reduction from PECS. Given an instance $\mathcal{I}$ of PECS, consisting of a set of boolean vectors $V \subseteq \{0,1\}^m$ and an integer $K$, we create an instance $\mathcal{J}$ of GF as follows. Each vector $v \in V$ corresponds to a user's preference over the $m$ items, where preferences are binary. The decision version of GF asks whether there exists a disjoint partition of $V$ into $K$ groups $g_1, \ldots, g_K$ such that $\sum_{j=1}^{K} \max_{1 \le i \le m} \left( \min_{v \in g_j} v[i] \right) \ge K$. We claim that this is true iff $\mathcal{I}$ is a YES-instance.

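($\Rightarrow$): Suppose there are groups $g_1, \ldots, g_K$ such that $\sum_{j=1}^{K} \max_{1 \le i \le m} \left( \min_{v \in g_j} v[i] \right) \ge K$. This sum can never be $> K$. If we replace the min by a sum in the objective function above, it is easy to see that the value will be exactly $|V|$, showing $\mathcal{I}$ is a YES-instance.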

($\Leftarrow$): Suppose there is a disjoint partition of $V$ into $K$ subsets showing that $\mathcal{I}$ is a YES-instance. Again, replacing the innermost summation by a min in the objective function of PECS will result in a value of $K$, showing $\mathcal{J}$ is a YES-instance.

Hardness under the aggregated voting semantics follows trivially 
from the construction above. 

□ 

4. APPROXIMATION ALGORITHMS: LM 

In this section, we investigate efficient algorithms for group formation based on LM. Notice that when $k = 1$, the Max, Min, and Sum aggregation (see Section 2.3) coincide. When $k > 1$, basing the LM score on the bottom item (i.e., the k-th item in the top-k list) corresponds to Min aggregation, basing it on the top item corresponds to Max aggregation, and basing it on the entire top-k set corresponds to Sum aggregation. Unless otherwise stated, we henceforth focus on Min and Sum aggregation.

We propose two greedy algorithms, each with its own absolute error guarantee. Algorithm GRD-LM-MIN is designed for LM considering Min aggregation and Algorithm GRD-LM-SUM is for Sum aggregation.

For simplicity of exposition, we use our running Examples 1 and 2 from Section 2.1. We interleave our exposition of the algorithm with an illustration of how it works on these examples for $k = 1$ as well as $k = 2$ (i.e., top-1 or top-2 items are recommended).

4.1 Min Aggregation 

Our proposed algorithm operates in a top-down fashion and gradually forms the groups. Intuitively, the algorithm consists of the following three high level steps.

Step 1 - forming a set of intermediate groups: It begins with the user set $\mathcal{U}$ and leverages a preference list $\mathcal{L}_u$ of items for each user $u$, sorted in non-increasing order of item ratings. In our running example, for user $u_2$ in Example 1, $\mathcal{L}_{u_2} = (i_3, 5; i_2, 3; i_1, 2)$. After that, the algorithm creates a set of intermediate groups. Each group $g$ contains a set of users who have the same top-k item sequence, as well as the same preference rating for the bottom item across all users in the group. E.g., the group $\{u_2, u_6\}$ shares the same top-1 item ($i_3$) and the same rating for it (5). Thus, for $k = 1$, this is a valid group. On the other hand, for $k = 2$, even though $u_2$ and $u_6$ share the same top-2 sequence of items $(i_3; i_2)$, they have distinct ratings for the bottom item, namely 3 and 2, and so they cannot be in the same group for $k = 2$.

Assuming Min-aggregation, the interesting observation is that, for each group, it is a good strategy to form these groups on the common top-k item sequence, as long as the group members (users) match on the preference rating of the bottom item. This is because the objective function (LM score) is based on the ratings of the bottom (i.e., k-th) item among group members. On the other hand, a subtle point is that it is not a good strategy to consider just the bottom item (even though the aggregation is on that item) instead of the entire top-k sequence. The reason is that the bottom item recommended to a group containing a user $u$ may differ from $u$'s personal bottom item. We next illustrate this with an example.

Example 3. Consider a group consisting of two users $u_1, u_2$ whose individual ratings over items $i_1, \ldots, i_3$ are respectively $u_1 = (5, 4, 1)$ and $u_2 = (1, 4, 5)$. For $k = 2$, the second best (i.e., bottom) item for either user in isolation is $i_2$, and yet under LM semantics, it can be easily verified that the top-2 item list recommended to the group $\{u_1, u_2\}$ is $(i_2; i^*)$, where $i^*$ could be any one of the remaining items. Notice that the bottom item recommended to the group is different from the individual bottom preference of every group member, even though they all shared the same item $i_2$ as their bottom preference, with identical ratings (4). This happens because $i_2$ ended up having the highest LM score for this group, among all items. When $i_2$ is moved to the top position, no matter which other item is chosen as the top-2 (bottom) item for the group, its LM score is just 1 in this example. This shows that forming a group solely based on shared bottom item and score can lead to a group with a poor LM score, when $k > 1$. □
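Example 3 can be checked numerically with a short sketch. The helper names and the rule that ties are broken by item id are assumptions of this illustration.

```python
# Ratings from Example 3.
RATINGS = {
    "u1": {"i1": 5, "i2": 4, "i3": 1},
    "u2": {"i1": 1, "i2": 4, "i3": 5},
}


def personal_top_k(user_ratings, k):
    """A single user's top-k (item, rating) pairs, best first."""
    return sorted(user_ratings.items(), key=lambda kv: (-kv[1], kv[0]))[:k]


def group_top_k_lm(ratings, group, k):
    """The group's top-k (item, LM score) pairs, best first."""
    items = next(iter(ratings.values())).keys()
    scores = {i: min(ratings[u][i] for u in group) for i in items}
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:k]


if __name__ == "__main__":
    for u in ("u1", "u2"):
        print(u, personal_top_k(RATINGS[u], 2))        # i2 is each user's bottom item
    print(group_top_k_lm(RATINGS, ["u1", "u2"], 2))    # i2 moves to the top of the list
```

Both users have $i_2$ (rating 4) as their personal second item, yet in the group's top-2 list $i_2$ is first with LM score 4 and the bottom item has LM score 1.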

To generalize this observation, our algorithm needs to store the top-k common sequence as well as the rating of the item on which the group satisfaction (i.e., the LM score of that group) is aggregated. Next, we describe how one can execute this first step above efficiently.

For every user $u$, we create a sequence $(i^1, i^2, \ldots, i^k : sc(u, i^k))$ comprising her top-k ranked items (in sequence) followed by the preference rating of the k-th (bottom) item, $sc(u, i^k)$. We use a hash map to hash each user $u$ using $(i^1, i^2, \ldots, i^k : sc(u, i^k))$ as the key and the user id $u$ as the value. Then, we create a heap $H$ to store the LM scores $sc(u, i^k)$ for various users.^3 This data structure enables us to efficiently retrieve the highest LM score, needed for Step 2 below. In Example 1, when $k = 1$, we hash user $u_2$ with key $(i_3 : 5)$ and value $u_2$. We add the entry $sc(u, i^k)$ to the heap, i.e., $H.insert(sc(u, i^k))$. Notice that users with the same key get hashed together and the associated value gets updated with their union after each such operation in the hash map. For example, $\{u_3, u_4\}$ gets hashed together, with key $(i_2 : 5)$ and value $\{u_3, u_4\}$. Finally, we preserve the association between the hash keys and the corresponding LM scores in another data structure. This operation generates the set of intermediate groups, where users with the same key belong to the same group.

^3 It is sufficient to store the value once per intermediate group formed.

For our running example (Example 1), when $k = 1$, we form the following set of intermediate user groups: $\{u_2, u_6\}$ on item $i_3$, $\{u_3, u_4\}$ on item $i_2$, and two singleton groups for $\{u_1\}$ and $\{u_5\}$, since no other user shares both their top-1 item and its rating ($u_1$ also ranks $i_2$ first, but with rating 4 rather than 5). When $k = 2$, $\{u_3, u_4\}$ will be grouped together with key $(i_2, i_1 : 2)$ and value $\{u_3, u_4\}$. This step creates the following intermediate groups: $\{u_3, u_4\}$, and four singleton groups ($\{u_1\}$, $\{u_2\}$, $\{u_5\}$, $\{u_6\}$). Observe that the sets of intermediate groups generated for $k = 1$ and for $k = 2$ are different.
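Step 1 can be sketched with an ordinary dictionary acting as the hash map. This is a minimal illustration under stated assumptions: the names `top_k_key` and `intermediate_groups` are hypothetical, ties in a user's own ranking are broken by item id, and the heap of LM scores is left out here.

```python
from collections import defaultdict

# Ratings from Table 1 (Example 1).
RATINGS = {
    "u1": {"i1": 1, "i2": 4, "i3": 3},
    "u2": {"i1": 2, "i2": 3, "i3": 5},
    "u3": {"i1": 2, "i2": 5, "i3": 1},
    "u4": {"i1": 2, "i2": 5, "i3": 1},
    "u5": {"i1": 3, "i2": 1, "i3": 1},
    "u6": {"i1": 1, "i2": 2, "i3": 5},
}


def top_k_key(user_ratings, k):
    """Hash key: (top-k item sequence, rating of the k-th item)."""
    ranked = sorted(user_ratings.items(), key=lambda kv: (-kv[1], kv[0]))[:k]
    return tuple(item for item, _ in ranked), ranked[-1][1]


def intermediate_groups(ratings, k):
    """Bucket users that share both the top-k sequence and the bottom rating."""
    buckets = defaultdict(list)
    for u in sorted(ratings):
        buckets[top_k_key(ratings[u], k)].append(u)
    return dict(buckets)


if __name__ == "__main__":
    for k in (1, 2):
        print(k, intermediate_groups(RATINGS, k))
```

For $k = 1$ this yields $\{u_2, u_6\}$, $\{u_3, u_4\}$ and the singletons $\{u_1\}$, $\{u_5\}$; for $k = 2$ it yields $\{u_3, u_4\}$ and four singletons, matching the walkthrough above.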

Step 2 - greedy selection of $\ell - 1$ groups: Recall that the group formation problem requires that we form at most $\ell$ groups of users. Observe that the objective function value $Obj$ (see Section 2.4) is maximized when all $\ell$ groups are formed. Accordingly, in this step, the algorithm runs in $\ell - 1$ iterations. Continuing with Example 1, suppose $\ell = 3$; this step of the algorithm then runs for 2 iterations. In each iteration, it retrieves the maximum element from the heap $H$ (i.e., the highest associated LM score), extracts the corresponding key and uses that to output the user group from the hash map. After that, it deletes that (key, value) entry from the hash map and deletes the corresponding LM score from the heap.

When $k = 1$, in Example 1, iteration 1 outputs the group $\{u_3, u_4\}$ with score 5 and iteration 2 outputs $\{u_2, u_6\}$ with score 5. When $k = 2$, for the same example instance, iteration 1 outputs the group $\{u_1\}$ with score 3 and iteration 2 outputs $\{u_2\}$ with score 3.

Step 3 - forming the $\ell$-th group: Finally, the last group $g_\ell$ is formed by considering all the remaining users from the hash map and a top-k LM score is assigned to this group. For our running example (Example 1), when $k = 1$, this group is $\{u_1, u_5\}$ with LM score of 1. When $k = 2$, this last group is $\{u_3, u_4, u_5, u_6\}$ with LM score of 1. Our algorithm terminates after this iteration. When $k = 1$, the final set of groups are $\{u_3, u_4\}$, $\{u_2, u_6\}$, $\{u_1, u_5\}$ and the corresponding value $Obj$ of the objective function is $5 + 5 + 1 = 11$. When $k = 2$, the final set of groups are $\{u_1\}$, $\{u_2\}$, $\{u_3, u_4, u_5, u_6\}$. The corresponding value of $Obj$ is $3 + 3 + 1 = 7$.

The pseudo-code of the algorithm is presented in Algorithm 1.

In general, the grouping formed by Algorithm GRD-LM-MIN is sub-optimal. For Example 1 with $k = 1$, we saw that the objective function value achieved by the grouping found by GRD-LM-MIN is 11. It may be verified that the optimal grouping for Example 1 is $\{u_1, u_3, u_4\}$, $\{u_2, u_6\}$, $\{u_5\}$ with an overall $Obj$ value of $4 + 5 + 3 = 12$.
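The three steps of GRD-LM-MIN can be put together in a short runnable sketch. It follows the description above but is not the authors' implementation: the function name `grd_lm_min`, the use of Python's `heapq` with negated scores as a max-heap, and the tie-breaking by item id and by key order are all assumptions of this example.

```python
import heapq
from collections import defaultdict

# Ratings from Table 1 (Example 1).
RATINGS = {
    "u1": {"i1": 1, "i2": 4, "i3": 3},
    "u2": {"i1": 2, "i2": 3, "i3": 5},
    "u3": {"i1": 2, "i2": 5, "i3": 1},
    "u4": {"i1": 2, "i2": 5, "i3": 1},
    "u5": {"i1": 3, "i2": 1, "i3": 1},
    "u6": {"i1": 1, "i2": 2, "i3": 5},
}


def grd_lm_min(ratings, k, ell):
    """Greedy group formation under LM semantics with Min aggregation."""
    # Step 1: hash users on (top-k item sequence, k-th item rating).
    buckets = defaultdict(list)
    for u in sorted(ratings):
        ranked = sorted(ratings[u].items(), key=lambda kv: (-kv[1], kv[0]))[:k]
        key = (tuple(item for item, _ in ranked), ranked[-1][1])
        buckets[key].append(u)
    # heapq is a min-heap, so store negated LM scores to pop the maximum first.
    heap = [(-key[1], key) for key in buckets]
    heapq.heapify(heap)

    # Step 2: output the ell - 1 intermediate groups with the highest LM score.
    groups, obj = [], 0
    for _ in range(ell - 1):
        if not heap:
            break
        neg_score, key = heapq.heappop(heap)
        groups.append(buckets.pop(key))
        obj += -neg_score

    # Step 3: all remaining users form the last group, scored on its k-th item.
    rest = sorted(u for users in buckets.values() for u in users)
    if rest:
        items = next(iter(ratings.values())).keys()
        lm = sorted((min(ratings[u][i] for u in rest) for i in items), reverse=True)
        groups.append(rest)
        obj += lm[k - 1]
    return groups, obj


if __name__ == "__main__":
    print(grd_lm_min(RATINGS, k=1, ell=3))
    print(grd_lm_min(RATINGS, k=2, ell=3))
```

On Example 1 with $\ell = 3$, the sketch reproduces the walkthrough: objective 11 for $k = 1$ and 7 for $k = 2$.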

4.2 Sum Aggregation 

The greedy algorithm for Sum aggregation, GRD-LM-SUM, exploits a similar framework, except that it primarily differs in Step 1, i.e., in the intermediate groups formation step of the previous algorithm. Notice that GRD-LM-MIN forms these intermediate groups by bundling those users who have the same top-k sequence, as well as the same preference rating for the k-th item. This strategy falls short for GRD-LM-SUM, where we are interested in aggregating satisfaction over the entire $k$-itemset. Therefore, GRD-LM-SUM forms these intermediate user groups by hashing users who have not only the same top-k item sequence, but also the same score for each item.

Algorithm 1 Algorithm GRD-LM-MIN

Require: Preference lists $\mathcal{L}_u$ for all $u \in \mathcal{U}$, $k$, group recommendation semantics $\mathcal{F}_{LM}$, $\ell$, hash map $h$ and heap $H$;

for every user $u$, generate the top-k preference list $\mathcal{L}_u^k$ from $\mathcal{L}_u$;
for every user $u$, generate the top-k item sequence and the k-th item score $sc(u, i^k)$, i.e., $(i^1, i^2, \ldots, i^k : sc(u, i^k))$;
for every distinct $(i^1, i^2, \ldots, i^k : sc(u, i^k))$, create $h.key = (i^1, i^2, \ldots, i^k : sc(u, i^k))$, $h.value = \{u\}$, where each user $u$ has the same $sc(u, i^k)$;
perform $H.insert(sc(u, i^k))$, associated with every unique key;
$\mathcal{G}$ = {values present in $h$};
$Obj = 0$;
$j = 0$;
while ($j < (\ell - 1)$) do
    retrieve the maximum LM score $SC$ from $H$;
    retrieve the sequence key $S$ associated with $SC$;
    retrieve the value for the given key, $g_j = h.key(S)$;
    $Obj = Obj + SC$;
    remove $S$ from $h$;
    remove $SC$ from $H$;
    $j = j + 1$;
end while
form group $g_\ell$ with all the remaining users in $\mathcal{U}$;
$sc(g_\ell, i^k) = \min_{u \in g_\ell} sc(u, i^k)$;
$Obj = Obj + sc(g_\ell, i^k)$;
return groups $g_1, g_2, \ldots, g_\ell$ and $Obj$;

More concretely, for every user $u$, we create a sequence $(i^1 : sc(u, i^1), i^2 : sc(u, i^2), \ldots, i^k : sc(u, i^k))$ comprising her top-k ranked items and scores. We use a hash map to hash each user $u$ using $(i^1 : sc(u, i^1), i^2 : sc(u, i^2), \ldots, i^k : sc(u, i^k))$ as the key and the user id $u$ as the value. Then, in the heap $H$, we store the aggregated LM scores $\sum_{j=1}^{k} sc(u, i^j)$ for various users, i.e., we perform $H.insert(\sum_{j=1}^{k} sc(u, i^j))$. In Example 1, when $k = 2$, users $u_3$ and $u_4$ are hashed together this way with key $(i_2 : 5, i_1 : 2)$ and value $\{u_3, u_4\}$. The other four users get hashed in 4 individual buckets, as their top-2 item rating sequences do not match. We insert a score of $5 + 2 = 7$ in the heap for users $u_3$ and $u_4$, and the respective sum scores of top-2 items for the other four users.

Steps 2 and 3 of GRD-LM-SUM are akin to those of GRD-LM-MIN,
except for the obvious difference that the group satisfaction score
is now aggregated over all k items. We omit the details for brevity.
On the running example, this algorithm may form the following three
groups at the end: $\{u_3, u_4\}$, $\{u_1, u_5, u_6\}$, $\{u_2\}$, with a total
objective function value of $(5 + 2) + (1 + 1) + (5 + 3) = 17$.
The appendix gives an example where the grouping produced by
GRD-LM-SUM is suboptimal.
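
The difference in hashing between the two LM variants can be made concrete with a small helper. This is a sketch under assumptions: the function name is hypothetical, ties are broken by item id, and the rating of 1 for a third item in the usage note is invented for illustration (the text only gives the top-2 ratings of $u_3$ and $u_4$).

```python
def lm_sum_entry(ratings, k):
    """Return (hash key, heap score) for one user under GRD-LM-SUM.

    Unlike GRD-LM-MIN, the key records the score of every top-k item,
    and the heap score is the sum over all k items.
    """
    top = sorted(ratings.items(), key=lambda kv: (-kv[1], kv[0]))[:k]
    key = tuple(top)                  # item ids together with their scores
    score = sum(s for _, s in top)    # aggregated over all k items
    return key, score
```

For a user with ratings `{"i1": 2, "i2": 5, "i3": 1}` and `k = 2`, the helper yields the key `(("i2", 5), ("i1", 2))` and the score 7, matching the bucket and heap entry described for $u_3$ and $u_4$.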

4.3 Analysis 

In this subsection, we analyze the performance of the greedy algorithms
for LM, compared to optimal grouping. We also analyze
the running time complexity of the algorithms.

Approximation Analysis: Our analysis focuses on the absolute
error of the greedy algorithms, compared to their optimal solutions.

Definition 3 (Absolute error). Let $\Pi$ be a problem,
$I$ an instance of $\Pi$, $A(I)$ the solution provided by algorithm
$A$, and $f(A(I))$ the value of the objective function for that solution.
Let $OPT(I)$ be the value of the objective function for an optimal
solution on instance $I$. We say Algorithm $A$ solves the problem
with a guaranteed absolute error of $\eta$, provided that for every instance
$I$ of $\Pi$, $|f(A(I)) - OPT(I)| \le \eta$. □

Theorem 2. Algorithm GRD-LM-MIN solves the group formation
problem under LM semantics using Min aggregation with a
guaranteed absolute error of at most $r_{max}$, where $r_{max}$ is the
maximum value in the rating scale.

Proof. Let $g_1, g_2, \ldots, g_\ell$ (resp., $g'_1, g'_2, \ldots, g'_\ell$) be the set of groups
formed by Algorithm GRD-LM-MIN (resp., the optimal algorithm).
Sort the groups w.r.t. the group's LM score and assume that the orders
$g_1, \ldots, g_\ell$ and $g'_1, \ldots, g'_\ell$ are non-increasing in the group LM
scores.

Recall that $X_g$ denotes the top-k item list recommended to a
group $g$ under the group recommendation semantics under consideration,
which, for this theorem, is LM. Similarly, $g(X_g)$ denotes
the LM score of group $g$ w.r.t. the recommended top-k item list $X_g$.
Let $OPT(i) := \sum_{j=1}^{i} g'_j(X_{g'_j})$ and $GRD(i) := \sum_{j=1}^{i} g_j(X_{g_j})$, i.e.,
they are the partial sums of LM scores of the first $i$ groups formed
by the two algorithms. Clearly, $OPT(\ell)$ and $GRD(\ell)$ are the final
objective function values achieved by the two algorithms, and
$OPT(\ell) = OPT(\ell - 1) + g'_\ell(X_{g'_\ell})$ and similarly,
$GRD(\ell) = GRD(\ell - 1) + g_\ell(X_{g_\ell})$.

The overall proof hinges on the following points.

(1) The aggregated satisfaction score generated by the best $\ell - 1$
groups of GRD is always larger than or equal to that of OPT, i.e.,
$GRD(\ell - 1) \ge OPT(\ell - 1)$. We prove this below with a simple domination
argument.

First, notice that the LM score of $g_1$ cannot be less than that of
$g'_1$. By construction, $g_1$ consists of a set of users who are
indistinguishable w.r.t. their top-k list and bottom item score, which is
what determines the LM score of a group; besides, $g_1$ has the highest
such bottom score and hence the highest possible LM score among
all possible groups. Now suppose, for contradiction, that domination
fails for some group, and let $i < \ell$ be the smallest number such that
the LM score of $g_i$ is less than that of $g'_i$. Removal of any user
from $g_i$ leaves its LM score unchanged, while adding a new user
to $g_i$ cannot possibly increase its LM score. This contradicts the
assumption. We just showed that for $1 \le i < \ell$, $g_i$ dominates $g'_i$. Thus,
$GRD(\ell - 1) \ge OPT(\ell - 1)$.

(2) For the $\ell$-th group, notice that $g'_\ell(X_{g'_\ell}) - g_\ell(X_{g_\ell}) \le r_{max}$,
since the maximum possible rating value is $r_{max}$.

The theorem follows from (1) and (2). □
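
How points (1) and (2) combine can be written out in one line. This only restates the argument above. The notation is an assumption made for illustration: $g_\ell(X_{g_\ell})$ stands for the LM score of the $\ell$-th greedy group on its recommended list, primes mark the optimal grouping, and $r_{max}$ stands for the maximum rating value.

```latex
\[
OPT(\ell) - GRD(\ell)
  = \underbrace{OPT(\ell-1) - GRD(\ell-1)}_{\le\, 0 \text{ by (1)}}
  + \underbrace{g'_\ell(X_{g'_\ell}) - g_\ell(X_{g_\ell})}_{\le\, r_{max} \text{ by (2)}}
  \;\le\; r_{max}.
\]
```

Since $OPT(\ell) \ge GRD(\ell)$ for a maximization problem, the left-hand side coincides with the absolute error of Definition 3.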

Theorem 3. Algorithm GRD-LM-SUM solves the group formation
problem under LM semantics using Sum aggregation with a
guaranteed absolute error of at most $k \times r_{max}$, where $r_{max}$
is the maximum value in the rating scale.

Proof. (Sketch): We omit the details for lack of space but note
that this proof uses similar reasoning as above. Akin to GRD-LM-MIN,
only the $\ell$-th group of GRD-LM-SUM is subject to error compared
to OPT, where each of the k items can accrue at most $r_{max}$ error.
Therefore, the aggregated error over all k items, i.e., the absolute
error of GRD-LM-SUM, is upper-bounded by $k \times r_{max}$. □

Running time complexity: We first describe the running time
complexity of GRD-LM-MIN. Line 2 of Algorithm GRD-LM-MIN
takes $O(nk)$ time overall to produce the top-k item list per user. Line 3
takes $O(n)$ time to hash all users. Adding the LM scores to the heap also
takes $O(n)$ time overall. The while loop runs for $\ell - 1$ iterations,
and in each iteration the highest LM score is obtained in constant
time from the heap, and rebuilding the heap takes $O(\log n)$ time.
Therefore, the entire while loop takes $O(\ell \log n)$ time. Forming the
$\ell$-th group (lines 17-18) can take at most $O(n \log k)$ time. Therefore,
the overall complexity of the algorithm is $O(nk + n + \ell \log n)$,
or simply $O(nk + \ell \log n)$. Similarly, it can be shown that the running
time of GRD-LM-SUM is also $O(nk + \ell \log n)$.

5. ON APPROXIMATION ALGORITHMS : 
AV 

Next, we describe algorithms to produce groups with high satisfaction
scores under the semantics of aggregate voting (AV), considering
both Min and Sum aggregation. Unlike least misery (LM),
aggregate voting defines the satisfaction score of a group as the sum
of the preference scores of the individual users in the group, for the
recommended top-k itemset.

An insight for forming good groupings under the AV semantics
is that users who share the same top-k sequence of items could be
grouped together, irrespective of the underlying aggregation function
(Min/Max/Sum). Notice that this grouping principle differs
from that used by the greedy algorithms for LM, which look for
not only a common top-k item sequence but also a common rating
for the bottom (k-th) item (GRD-LM-MIN) or for all k items
(GRD-LM-SUM). To see why the above grouping principle is intuitive,
notice first that a group formed in this way preserves the personal
top-k list associated with each group member. Secondly, the contribution
of this group to the overall satisfaction score of the grouping
is the sum of the members' ratings of the bottom item (Min aggregation), or
of all k items (Sum aggregation). Two users who have the same
top-k item sequence are therefore best grouped together,
irrespective of their individual item ratings. Thus, grouping on
item scores is not a useful operation for AV semantics.

We devise two algorithms GRD-AV-MIN (for Min aggregation) 
and GRD-AV-SUM (for Sum aggregation) that exploit the same al¬ 
gorithmic framework as that of greedy algorithms for LM. 

Min Aggregation: GRD-AV-MIN also runs in a top-down manner
(starting with a single group with all users and forming a set
of intermediate groups from there) and consists of three primary
steps. Computationally, it has only two major differences from
GRD-LM-MIN, described next:

(1) Consider Lines 2 and 3 of Algorithm 1, which hash every
unique top-k item sequence and the bottom item score in the hash
map. By contrast, as explained above, GRD-AV-MIN hashes only
the top-k item sequence and not the k-th item score. Because of
this, GRD-AV-MIN is likely to generate fewer unique hash keys
(and hence fewer intermediate groups). This observation is corroborated
by our experiments in Section 7.

(2) The other difference is more obvious: the group satisfaction
score is computed differently in GRD-AV-MIN compared to
GRD-LM-MIN. What we store in heap $\mathcal{H}$ in line 4 is the aggregated
group satisfaction score $\sum_{u} sc(u, i^k)$, where each user $u$ has the
same top-k item sequence, and $sc(u, i^k)$ is their respective bottom
item score.

The remaining operations of Algorithms GRD-AV-MIN and
GRD-LM-MIN are essentially similar.

Consider the running example and assume the groups are to be formed
using the Min aggregation function over the top-2 ($k = 2$) recommended
itemset under AV.

Step 1 of GRD-AV-MIN will only group $\{u_3, u_4\}$ together, as
they have the same top-2 item sequence: h.key = $(i_2, i_1)$ and
h.value = $\{u_3, u_4\}$. The corresponding AV score, 4, is inserted
into the heap $\mathcal{H}$. The remaining four users will form singleton groups.

Step 2 of GRD-AV-MIN will have only one iteration (as $\ell - 1 = 1$).
It will retrieve the element from the heap with the highest AV
score on the second-ranked item (here item $i_1$), which is 4. Consequently, it
will produce $\{u_3, u_4\}$ as the first group. The top-2 itemset for this
group will be $(i_2, i_1)$.

Step 3 of GRD-AV-MIN will form the second group by merging
the remaining singleton groups into $\{u_1, u_2, u_5, u_6\}$. The AV
score on the second-ranked item is 9, considering item $i_2$. This group will be
recommended the top-2 itemset $(i_3, i_2)$. The algorithm
terminates after that and achieves the objective function value $4 + 9 = 13$.

Notice that GRD-AV-MIN may produce sub-optimal answers
as well. For the running example, the optimal two user groups are instead
$\{u_1, u_3, u_4\}$, $\{u_2, u_5, u_6\}$. In this case, the first group has the
same recommended item list as that of the first group of GRD-AV-MIN;
however, the second group has $\{i_2, i_3\}$ as the recommended itemset.
The overall objective function value is now 14, which is higher.

Sum Aggregation: Operationally, there is no difference between
GRD-AV-MIN and GRD-AV-SUM, except for the obvious difference
that the latter aggregates the group satisfaction score over
the entire k-itemset (not just on the k-th item). Using the running example
again, Step 1 of GRD-AV-SUM will group $\{u_3, u_4\}$ together, as
they have the same top-2 item sequence, h.key = $(i_2, i_1)$ and
h.value = $\{u_3, u_4\}$, but will insert $(5 + 2) + (5 + 2) = 14$
as the corresponding AV score in the heap. Other than that, the
rest of the users will form four singleton groups. GRD-AV-SUM
will result in the same set of user groups as that of GRD-AV-MIN,
but the overall objective function value is $14 + 20 = 34$, as the
second group $\{u_1, u_2, u_5, u_6\}$ will now have a satisfaction score
of $(4 + 3 + 3 + 1) + (1 + 4 + 2 + 2) = 20$ based on Sum
aggregation.
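
Both AV variants can be sketched with one Python function that differs from the LM case only in the hash key and the score computation. This is an illustrative sketch, not the authors' code: the function names, the `agg` parameter, the tie-breaking by item id, and the assumption that every user has rated every item are choices made here.

```python
import heapq
from collections import defaultdict


def top_k(scores, k):
    """Return the k best (item, score) pairs; ties broken by item id (assumption)."""
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:k]


def av_score(prefs, users, k, agg):
    """AV satisfaction of a group: each item's score is summed over the members."""
    items = set.intersection(*(set(prefs[u]) for u in users))
    totals = {i: sum(prefs[u][i] for u in users) for i in items}
    top = [s for _, s in top_k(totals, k)]
    return top[-1] if agg == "min" else sum(top)


def grd_av(prefs, k, num_groups, agg="min"):
    """Greedy grouping under AV; agg is "min" or "sum"."""
    # The hash key is the top-k item sequence only, without any scores.
    buckets = defaultdict(list)
    for user, ratings in prefs.items():
        buckets[tuple(i for i, _ in top_k(ratings, k))].append(user)

    # Max-heap over bucket scores (negated, since heapq is a min-heap).
    heap = [(-av_score(prefs, users, k, agg), key)
            for key, users in buckets.items()]
    heapq.heapify(heap)

    groups, obj = [], 0
    while len(groups) < num_groups - 1 and heap:
        neg_score, key = heapq.heappop(heap)
        groups.append(buckets.pop(key))
        obj -= neg_score

    # All remaining users are merged into the last group.
    rest = [u for users in buckets.values() for u in users]
    if rest:
        groups.append(rest)
        obj += av_score(prefs, rest, k, agg)
    return groups, obj
```

Because users in a bucket share the same top-k sequence, summing their ratings item by item preserves that sequence, so the bucket score computed by `av_score` agrees with the sum of the members' bottom-item scores (Min) or of all their top-k scores (Sum).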

5.1 Analysis 

Akin to Section 4.3, we present both qualitative and runtime
analyses of the greedy algorithms for AV.

Qualitative Analysis: Unlike the GRD-LM algorithms, the greedy algorithms
for AV do not come with any guarantees about the total satisfaction
score of the grouping they provide. While at this time the
approximability of optimal group formation under AV semantics is
open, we conjecture that the problem is MAX-SNP-hard and
cannot be approximated within a constant factor. We next give an
example to bring out the subtleties of AV semantics. The point is
that grouping a user $u$ with others such that the resulting top-k
order is arguably worse for $u$ personally can still produce
a group with a higher group satisfaction score than if $u$ had been
grouped with users with the same top-k item list.

Example 4. Consider four users $u_1, \ldots, u_4$ and two items $i_1, i_2$.
Let the ratings of the users on $(i_1, i_2)$ be, respectively, $u_1 = (5, 4)$, $u_2 =
u_3 = (4, 5)$, and $u_4 = (3, 2)$. Let $k = 2$. Suppose we wish to
form two groups. Considering Min aggregation, grouping based on
a common top-2 item list would produce the groups $\{u_1, u_4\}$ (satisfaction
score $4 + 2 = 6$) and $\{u_2, u_3\}$ (satisfaction score $4 + 4 = 8$)
for an overall satisfaction score of 14. However, suppose $u_1$ is
grouped together with $u_2, u_3$, and $u_4$ is left alone. The top-2 list
for the group $\{u_1, u_2, u_3\}$ becomes $(i_2, i_1)$, whereas $i_1$ is $u_1$'s
favorite. Yet, the satisfaction scores are $5 + 4 + 4 = 13$ for the
first group and 2 for the second, for a total satisfaction score of 15.
Even though the group's top-2 order changed to something sub-optimal
for $u_1$, the overall satisfaction has improved! This kind of behavior
is impossible under LM semantics. This illustrates that it is tricky
to reason about forming groups that even approximate the optimal
satisfaction score for AV semantics. □
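
The arithmetic of Example 4 can be checked with a few lines of Python. The helper names below are hypothetical and the code is only a sketch for recomputing the two totals from the ratings given in the example.

```python
def av_min_score(ratings, group, k):
    """AV score of the k-th ranked item for one group (Min aggregation)."""
    items = ratings[group[0]]
    totals = sorted(
        (sum(ratings[u][i] for u in group) for i in items), reverse=True
    )
    return totals[k - 1]


def total_satisfaction(ratings, grouping, k):
    """Objective value: sum of the group scores over all groups."""
    return sum(av_min_score(ratings, g, k) for g in grouping)


# Ratings from Example 4, as (i1, i2) per user.
ratings = {
    "u1": {"i1": 5, "i2": 4},
    "u2": {"i1": 4, "i2": 5},
    "u3": {"i1": 4, "i2": 5},
    "u4": {"i1": 3, "i2": 2},
}
same_top_k = [["u1", "u4"], ["u2", "u3"]]   # grouped by common top-2 list
mixed = [["u1", "u2", "u3"], ["u4"]]        # u1 moved to the other group

print(total_satisfaction(ratings, same_top_k, 2))  # 14
print(total_satisfaction(ratings, mixed, 2))       # 15
```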

Running time complexity: The running time of the greedy algorithms
for AV is similar to that of the LM algorithms, except that
computing the group satisfaction score requires iterating over all the users to
compute the sum, or over all k items for Sum aggregation.
Therefore, adding AV scores to the heap for GRD-AV-MIN and
GRD-AV-SUM now takes $O(nk)$ time. The while loop will overall
take $O(\ell \log n)$ time. The last step will now take $O(nk)$ time.
Therefore, both algorithms have the same time complexity, i.e.,
$O(nk + \ell \log n)$.

6. DISCUSSION 

In this section, we present some extensions to our proposed group
formation framework. In particular, we describe how to extend
Sum Aggregation to consider differential weights, as briefly discussed
in Section 2.3. Intuitively, Weighted Sum Aggregation can
assign different weights to the top-k items and not treat them equally.
We next describe two natural alternatives.

Weights at the item list level: For any group, we can assign a
weight to each of the top-k recommended items, where the weights
could simply be inversely proportional to the position or its logarithm.
This way, the top items will have higher weight than the
lower ones. Then instead of Sum, we compute Weighted Sum LM
or AV. This extension does not introduce any complications to our
proposed algorithms, as we only need to consider the weights when
the overall objective function value is calculated.
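
A position-weighted sum of this kind might be computed as sketched below. The function names and the `scheme` parameter are hypothetical. The logarithmic variant uses $1 / \log_2(p + 1)$ for position $p$, which is an assumption made here so that the first position gets a finite weight of 1; the text does not fix the exact form.

```python
import math


def position_weights(k, scheme="inverse"):
    """Weights for positions 1..k, decreasing with the position."""
    if scheme == "inverse":
        return [1.0 / p for p in range(1, k + 1)]
    if scheme == "log":
        # Shifted by one so that position 1 gets weight 1 (assumption).
        return [1.0 / math.log2(p + 1) for p in range(1, k + 1)]
    raise ValueError(f"unknown scheme: {scheme}")


def weighted_sum(item_scores, scheme="inverse"):
    """Weighted Sum aggregation over a group's top-k item scores, best first."""
    weights = position_weights(len(item_scores), scheme)
    return sum(w * s for w, s in zip(weights, item_scores))
```

For group scores `[5, 2]`, the inverse-position scheme gives $5 + 2/2 = 6$, compared to 7 under plain Sum aggregation. Only the final score computation changes, while the hashing and heap steps of the greedy algorithms stay as they are.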

Weights at the user level: A more interesting scenario is to consider
weighted aggregation at the user level. More specifically, how
satisfied an individual user is with the recommended top-k items
could be measured using IR techniques such as NDCG (Normalized
Discounted Cumulative Gain). Using a graded relevance
scale, NDCG computes the user satisfaction (i.e., gain) for an item,
given its position in the result list. The gain is then aggregated from
the top of the item list to the bottom, and the gain of each item is
discounted at lower ranks. After the weighted satisfaction of each user
is computed, any group recommendation semantics (such as LM
or AV) could be used to compute the group satisfaction. Our proposed
solutions do not require any significant change even here,
except for the fact that the user satisfaction will be computed in a
weighted fashion that our objective function must account for. Notice
that, except for the $\ell$-th group in our greedy algorithm, all the
users in the first $\ell - 1$ groups are fully satisfied, i.e., the recommended
top-k lists exactly match their individual top-k lists. Only
for users in the $\ell$-th group may dissatisfaction occur, which does
not affect the theoretical guarantees.
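
A per-user NDCG score could be computed along the following lines. This is a sketch under assumptions: the function names are hypothetical, the raw rating is used directly as the gain, and the discount is the common $1 / \log_2(\text{position} + 1)$. The text does not state which NDCG variant is intended.

```python
import math


def dcg(gains):
    """Discounted cumulative gain of a ranked list of gains."""
    return sum(g / math.log2(pos + 1) for pos, g in enumerate(gains, start=1))


def ndcg(user_ratings, recommended):
    """NDCG of a recommended item list for one user.

    user_ratings maps items to the user's ratings; recommended is the
    group's top-k list in rank order.
    """
    gains = [user_ratings[item] for item in recommended]
    ideal = sorted(user_ratings.values(), reverse=True)[:len(recommended)]
    return dcg(gains) / dcg(ideal)
```

A user whose personal top-k list matches the recommended list gets a score of 1, which reflects the remark that users in the first $\ell - 1$ greedy groups are fully satisfied. Any other order of the same items yields a score below 1.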

7. EXPERIMENTAL EVALUATIONS

We evaluate our proposed algorithms w.r.t. their effectiveness 
and efficiency. We also conduct a small scale user study on Amazon 
Mechanical Turk (AMT) to evaluate effectiveness. 

The development and experimentation environment uses Python
on a 2.9 GHz Intel Core i7 with 8 GB of memory running OS X 10.9.5.
We use IBM CPLEX for solving the IP instances. All numbers
are presented as the average of three runs.

Datasets: (1) Yahoo! Music: This dataset represents a snapshot
of the Yahoo! Music community's preferences for various songs.
Standard pre-processing for collaborative filtering and rating prediction
was applied to prepare this data set. The data has been
trimmed so that each user has rated at least 20 songs, and each
song has been rated by at least 20 users. The data has been randomly
partitioned so as to correspond to 10 equally sized sets of
users, in order to enable cross-validation. We use a subset of this
dataset in our experiments. The rating values are on a scale from
1 to 5, 5 being the best. More information about this dataset can be
found at the Yahoo! Research Alliance Webscope program.¹

(2) MovieLens: We use the MovieLens 10M ratings dataset.²
MovieLens is a collaborative rating dataset where users provide
ratings on a 1-5 scale. Table 3 contains the statistics of
both these datasets. Additionally, Section 7.3 conducts a user study
using Flickr data.

Algorithms Compared: In addition to the greedy algorithms
(Sections 4 and 5), we also developed optimal algorithms for group
formation, based on integer programming (IP) (Appendix A). In
addition to these, we also implemented two baseline algorithms
(Baseline-LM and Baseline-AV), described below, by adapting
prior work [22]. All three aggregation functions (Min/Max/Sum)
are considered.

¹ http://research.yahoo.com
² http://movielens.umn.edu


Dataset name      # users     # items
Yahoo! Music      200,000     136,736
MovieLens         71,567      10,681

Table 3: Dataset Descriptions


The baseline algorithms work as follows: For every user pair
$(u, u')$, we measure the Kendall-Tau distance between them
based on their individual ranking of items, induced by the ratings
they provide. This way, we obtain $dist(u, u')$ for each pair $u \ne u'$. Notice
that it is not sufficient to consider only the top-k items to compute
this ranked distance, because two users may have a very small overlap
in their top-k itemsets; therefore, we consider all the items to
obtain $dist(u, u')$. After that, we use K-means clustering [13] to
form a set of $\ell$ user groups. Once these groups are formed, for each
group, we compute the top-k item list and respective group satisfaction
scores (using Min/Max/Sum aggregation) based on LM or AV
semantics. We aggregate these scores over the $\ell$ groups to produce the
final objective function value. The maximum number of iterations
in the clustering is set to 100 by default.
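
The pairwise distance used by the baselines can be sketched as a count of discordant item pairs. The function name is hypothetical, and two details are assumptions made here because the text does not specify them: only items rated by both users are compared, and pairs tied in either ranking are not counted.

```python
from itertools import combinations


def kendall_tau_distance(ratings_a, ratings_b):
    """Number of item pairs that the two users order in opposite ways."""
    items = sorted(set(ratings_a) & set(ratings_b))
    discordant = 0
    for x, y in combinations(items, 2):
        diff_a = ratings_a[x] - ratings_a[y]
        diff_b = ratings_b[x] - ratings_b[y]
        if diff_a * diff_b < 0:   # opposite preference on this pair
            discordant += 1
    return discordant
```

This direct version compares all item pairs, so it takes time quadratic in the number of items for each user pair, which is in line with the baselines being sensitive to the number of items.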

Experimental Analysis Setup: Wherever applicable, we compare
the aforementioned algorithms both qualitatively and quantitatively.
For evaluation of quality, we measure the objective function
value (for AV or LM), as well as the average group satisfaction score
on the top-k item lists across the groups,
$\frac{\sum_{x=1}^{\ell} \sum_{j=1}^{k} sc(g_x, i^j)}{\ell}$, where
$sc(g_x, i^j)$ is the average score of the $j$-th item for group $g_x$. We additionally
present the distribution of group sizes to examine whether our solution
can give rise to many degenerate groups (i.e., singleton
groups). For scalability experiments, we primarily measure the
clock time to produce the groups and their respective top-k item
lists. We typically vary the number of users ($n$), number of items ($m$),
number of groups ($\ell$), and the number of recommended items ($k$).
In the user study, we evaluate the effectiveness of our group formation
algorithms compared with the baselines.
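
The average group satisfaction measure could be computed as in the sketch below. It assumes the reading of the measure in which each group contributes the sum, over its recommended items, of the members' average rating, and the result is averaged over the groups. The function name and the input format are hypothetical.

```python
def average_group_satisfaction(prefs, grouping):
    """Average, over the groups, of the summed per-item average ratings.

    prefs maps users to {item: rating}; grouping is a list of
    (members, recommended_items) pairs, one pair per group.
    """
    total = 0.0
    for members, recommended in grouping:
        for item in recommended:
            total += sum(prefs[u][item] for u in members) / len(members)
    return total / len(grouping)
```

With ratings on a 1 to 5 scale and 5 recommended items per group, this value can be at most 25.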

Preview of Experimental Results: Our key findings are: (1) We
find that our proposed group formation algorithms achieve objective
function values, and average group satisfaction, comparable to those
of the optimal algorithms, for all three aggregation functions
(Min/Max/Sum). (2) Our results indicate the practical usefulness
of Min aggregation, where the average aggregate group
satisfactions over the entire top-k item lists are presented. Even
though Min aggregation only optimizes on the k-th item, our results
demonstrate high aggregate user satisfaction over the entire
list. (3) We observe that our solution produces groups that are quite
balanced in size, i.e., the variation in size is small. This observation
establishes that our greedy algorithms are also practically
viable. (4) We observe that our proposed algorithms are scalable
and form groups efficiently, even when the number of users, items,
or groups is large. We also observe that we outperform the baseline
algorithms quite consistently in all cases, both qualitatively and
w.r.t. performance (efficiency, satisfaction scores). (5) Our user
study results indicate that, with statistical significance, our
optimization-guided group formation algorithms produce user groups
in which the users are more satisfied with the top-k recommendations
than with those of the baseline algorithms. This observation is
consistent across the datasets. Sections 7.1, 7.2, and 7.3 present the
quality, scalability, and the user study results, respectively.

For lack of space, we only present a subset of results. The results 
are representative and the omitted ones are similar. 


7.1 Quality Experiments 

The IP-based optimal algorithms do not complete in a reasonable
time beyond 200 users, 100 items, and 10 groups. Our default
settings are as follows: number of users = 200, number of items
= 100, number of groups = 10, $k = 5$, and Max aggregation. We
vary # users, # items, # groups, and k in the top-k list. We measure
two quality metrics: (1) the objective function value, i.e., the total
satisfaction score of a grouping under LM or AV semantics, and (2) the
average group satisfaction score over all the recommended top-k
items. In addition, (3) we present the distribution of group sizes for
both LM and AV.

Interpretation of Results: We observe that the GRD algorithms
outperform the corresponding baseline algorithms on both of
these datasets. With an increasing number of users, the objective function
value as well as the average group satisfaction on the recommended
top-k itemset decrease for a given value of the number of
groups $\ell$, as a larger number of users typically adds more variance
in user preference. On the other hand, with an increasing number of
groups, both of these values increase, as there is more room for
similar users to be grouped together, thereby improving overall satisfaction.
With increasing k, both of these values decrease again
(except under Sum aggregation).

7.1.1 Measuring Objective Function 

For lack of space, we only present the results for Yahoo! Music 
dataset. Results on MovieLens are similar. 

Number of users: We vary the number of users and use the default
settings for the rest of the parameters. Figure 1(a) depicts the
results. With an increasing number of users, the objective function
value decreases in general, because more users typically introduce
larger variation in the preferences and hence a smaller LM score. The results
also clearly demonstrate that GRD-LM-MAX consistently outperforms
Baseline-LM-MAX and achieves an objective function
value comparable to OPT-LM-MAX.

Number of items: Figure 1(b) shows that with an increasing number
of items, the objective function value typically increases. This
is also intuitive, because the top items of each group are likely to
improve, leading to a higher LM score. Also, GRD-LM-MAX consistently
outperforms Baseline-LM-MAX.

Number of groups: When the number of groups is increased, the
overall objective function value also increases. Notice that the objective
function reaches its maximum possible value when the number
of groups equals the number of users. Therefore, with more groups,
users get the flexibility to be with other users who are "similar" to
them. Figure 1(c) presents these results. Similar to the above two
cases, GRD-LM-MAX outperforms the baseline.

Top-k on Min and Sum aggregation: In this experiment,
we vary k and produce the objective function value over the least
preferred (i.e., bottom) item on the recommended top-k item list.
For Min aggregation, Figure 2(a) shows that with increasing k,
the objective function value decreases across the algorithms. This
is natural, because group satisfaction based on the bottom element
typically decreases with increasing value of k. Figure 2(b)
shows the Sum aggregation, where the objective function value increases
across the algorithms with increasing k, although the rate
of increase is smaller for higher k. These results demonstrate that
the GRD-LM algorithms are highly effective: their respective objective
function values are quite comparable with that of OPT-LM.

7.1.2 Avg Group Satisfaction Over top-k List 

For lack of space, for this set of experiments, we only present
the results on the MovieLens dataset considering AV semantics. Here,
we measure the average user satisfaction over all the recommended
top-k items, i.e., $\frac{\sum_{x=1}^{\ell} \sum_{j=1}^{k} sc(g_x, i^j)}{\ell}$, where
$sc(g_x, i^j)$ is the average AV score on the $j$-th item for group $g_x$, using Min aggregation.
While GRD-AV-MIN is not specifically optimized for this measure,
our experimental results indicate that the formed groups have
very high average satisfaction nevertheless.

Number of users: Figure 3(a) presents the results where we vary
the number of users. Notice that for 10 user groups, the maximum
possible satisfaction per group over the top-k item list could be as
high as 25 when 5 items are recommended (and the ratings are on
a scale of 1 to 5). This is indeed true because each of the 5 recommended
items can contribute an average score of at most 5, for a total of $5 \times 5 = 25$.
Interestingly, GRD-AV-MIN consistently produces a score that
is close to 25 and outperforms the baseline algorithm.

Number of items: When the number of items is increased, the group
satisfaction score is likely to increase, as the algorithms now have
more options to recommend the top-k items from, for each group.
Figure 3(b) presents these results. The average group satisfaction
score of both algorithms improves slightly with more items, and
again GRD-AV-MIN consistently beats the baseline.

Number of groups: The results are shown in Figure 3(c). As
expected, we observe that the aggregated group satisfaction over
the top-k items improves with the increasing number of groups. As
explained before, with an increasing number of groups, the algorithms
have more flexibility to form groups with users who are highly similar
in their top-k item preferences.

Top-k on Min aggregation: Finally, we vary k and compute
the aggregated group satisfaction score over all top-k items
(results in Figure 3(d)). With increasing k, the aggregation is done
over more items, thus increasing the overall score. As
shown in the figure, GRD-AV-MIN produces results highly comparable
with those of OPT-AV-MIN and consistently outperforms the
baseline algorithm Baseline-AV-MIN.



Figure 2: We measure the objective function value by varying top-k
for (a) Min and (b) Sum aggregation using Yahoo! Music. The default
parameters are # users = 200, # items = 100, # groups = 10, k = 5.


7.1.3 Distribution of Group Sizes 
We randomly select 200 users and 100 items with the objective
to form $\ell = 10$ groups and recommend top-k ($k = 5$) items to
each group using both datasets. In each sample of 200 users, we
measure the number of users in each of these 10 groups. We repeat
this experiment 3 times and present the average variation in
group size using a 5-point summary: average minimum size, average
25th percentile (Q1), average median, average 75th percentile
(Q3), average maximum size. This representation is akin to the
box-plot summary. The underlying algorithms are GRD-LM
and GRD-AV considering both Max and Sum aggregation. These
results are summarized in Table 4. It is evident that the groups
that are generated by our algorithms are balanced in general. Unsurprisingly,
GRD-LM-MAX produces more uniform groups than
GRD-LM-SUM, as the latter imposes a stricter condition on grouping
members (it needs to match both the top-k sequence and the ratings).


Semantics   Quantile   GRD-*-MAX   GRD-*-SUM
LM          Minimum    11.33       8.33
            Q1         15.75       11.5
            Median     18.5        13.66
            Q3         23.58       19.33
            Maximum    31.33       39.33
AV          Minimum    20.33       14.33
            Q1         22.4        19.35
            Median     25.4        22.5
            Q3         28.66       25.95
            Maximum    30.33       33.75

Table 4: Distribution of Average Group Size


Interestingly, notice that the generated group sizes have smaller average
variation under AV than under LM. This is expected, because
GRD-AV only requires users to have the same top-k item sequence
(irrespective of the specific ratings on the bottom item) to belong to
the same group. Thus, AV tends to produce relatively larger groups
and results in smaller variation in size across the generated groups.
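
The 5-point summary averaged over repeated runs can be sketched with the Python standard library. The function names are hypothetical, and the quartile convention (the "inclusive" method of `statistics.quantiles`) is an assumption, since the text does not say how Q1 and Q3 are interpolated.

```python
from statistics import mean, quantiles


def five_point(sizes):
    """Minimum, Q1, median, Q3, and maximum of the group sizes in one run."""
    q1, median, q3 = quantiles(sizes, n=4, method="inclusive")
    return (min(sizes), q1, median, q3, max(sizes))


def average_summary(runs):
    """Average each of the five statistics over several runs."""
    summaries = [five_point(sizes) for sizes in runs]
    return tuple(mean(column) for column in zip(*summaries))
```

Here `runs` would hold one list of 10 group sizes per repetition, and each row of Table 4 corresponds to one component of the returned tuple.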

7.2 Scalability Experiments 

For brevity, we present the results for only the larger dataset,
Yahoo! Music, and present a subset of results. As mentioned earlier,
OPT-LM and OPT-AV do not terminate beyond 200 users, 100
items, and 10 groups, and are thus omitted. Our default settings
here are as follows: number of users = 100,000, number of items
= 10,000, number of groups = 10, $k = 5$, and Min aggregation.
Again, we vary # users, # items, # groups, and k.

Interpretation of Results: The running time of GRD is primarily
affected by the number of users ($n$), number of groups ($\ell$), and
$k$. Therefore, as will be seen throughout the results, varying
the number of items does not impact the computational cost of
either GRD-LM or GRD-AV. Between GRD-LM and GRD-AV, the
latter takes more time, as it has to aggregate the satisfaction of all
the users in a single group to produce the group satisfaction score.
The running times of GRD-LM-MIN and GRD-LM-SUM are observed
to be comparable, which corroborates our theoretical analysis. For
the baseline algorithms (Baseline), running time increases with
an increasing number of users and number of groups. Additionally,
to produce the top-k recommendations once the groups are formed,
the last step of these algorithms has to sift through the item ratings
of all users inside every group. Given a group obtained using clustering,
the ranked item lists of users may not be aligned. Thus,
to form the group's overall top-k list, one may have to consider
arbitrarily many items in the individual ranked item lists of the
group members. Therefore, the computation time of the baseline
algorithms increases with increasing $m$ or $k$. In the case of our greedy
algorithms, groups are formed by insisting that group members are
aligned on the top-k item sequence. Thus, forming the overall top-k
list for a group is straightforward in this case, for all groups but
the $\ell$-th group formed by the greedy algorithms. For the $\ell$-th group,
the algorithm sifts through the top-k items per user to generate the score.

7.2.1 Scalability Experiments : LM 

Number of users: We vary the number of users and measure the
clock time of group formation and top-k recommendation for
GRD-LM-MIN and Baseline-LM-MIN. (For the record, the optimal
algorithms do not complete even after one hour.) Figure 4(a)
presents the results in minutes. As expected, the running time of
GRD-LM-MIN increases linearly, and it always terminates within 2 minutes. These
results also show that the running time of the clustering-based baseline
algorithm is non-linear, and that our algorithm significantly outperforms it.

Figure 1: We measure the objective function value by varying # users, # items, # groups, respectively, one at a time. The default parameters are #
users = 200, # items = 100, # groups = 10, k = 5 and Max-aggregation. The underlying dataset is Yahoo! Music.

Figure 3: We measure the average group satisfaction score on the top-k item list, $\frac{\sum_{x=1}^{\ell} \sum_{j=1}^{k} sc(g_x, i^j)}{\ell}$, by varying # users, # items, # groups, and
top-k, respectively, one at a time. The default parameters are # users = 200, # items = 100, # groups = 10, k = 5. The underlying dataset is MovieLens.

Number of items: Next, we vary the number of items and measure
the clock time. As can be seen from Figure 4(b), the running
time of our proposed algorithm is less affected by varying the number
of items. This result is also consistent with our theoretical analysis.
Recall from Section 4.3 that the running time of both
greedy algorithms is $O(nk + \ell \log n)$, which is independent of the number
of items $m$. Since GRD-LM-MIN leverages the sorted top-k list of
items, more items do not necessarily lead to higher computational
cost. On the contrary, the clustering-based baseline has to produce
the top-k itemset for each user group once the groups are formed.
As explained earlier, this requires considerable work since the top-k
lists of cluster members may not be aligned. Figure 4(b) clearly
demonstrates that Baseline-LM is rather sensitive to the increasing
number of items. GRD-LM-MIN consistently beats the baseline.

Number of groups: These results are presented in Figure 4(c).
When the number of groups is increased, both Baseline-LM-MIN
and our algorithm take more time. This observation is also consistent
with our theoretical analyses, as the running time of both these
algorithms depends on the number of groups. However, GRD-LM-MIN
scales linearly with the increasing number of groups and outperforms
its baseline counterpart quite consistently.

Top-k on Min and Sum aggregation: We vary k for both
GRD-LM-MIN and Baseline-LM-MIN and present the running
time in Figure 5(a). While GRD-LM-MIN consistently outperforms
Baseline-LM-MIN, neither of these algorithms is very sensitive
to increasing k. The computation time of the first $\ell - 1$ groups
is not much affected by k, and only determining the LM score
and hence the top-k list of the last group (i.e., the $\ell$-th group) is affected.
The same observation holds for the baseline, as it incurs the majority
of its computations in forming the clusters, which does not depend on k.
Figure 5(b) presents the running time of both GRD-LM-SUM and
Baseline-LM-SUM. GRD consistently outperforms Baseline,
as expected, similar to Min aggregation.


7.2.2 Scalability Experiments: AV


Number of users: In this final set of scalability experiments, we
again vary the number of users and compute the running time of the group
formation algorithms under AV semantics. Figure 6(a) presents the
results. These results are similar to those of LM, except that AV
takes more time to compute than LM. As expected, the running
time of Baseline-AV is similar to that of Baseline-LM
in Figure 4(a), as the clustering algorithm does not exploit the (AV)
semantics in the group formation process. Our proposed greedy
algorithm consistently outperforms the baseline algorithm.

Number of items: We vary the number of items and present the
computation time in Figure 6(b). GRD-AV-MIN takes more time
to terminate than GRD-LM-MIN (Figure 4(b)); this
slight increase is due to the extra computation that GRD-AV-MIN
has to perform to aggregate the AV score for each group. On the other
hand, GRD-AV-MIN is not sensitive to the increasing number of
items, similarly to GRD-LM-MIN. The figure clearly illustrates that
Baseline-AV-MIN takes more time as the number of items is
increased. As usual, GRD outperforms Baseline.

Number of groups: We vary the number of groups and observe
that both algorithms incur higher processing time with an increased
number of groups. Figure 6(c) presents these results. As expected,
GRD-AV-MIN scales linearly with the increasing number of groups
and consistently outperforms the baseline algorithm.

Top-k on Min and Sum aggregation: We vary k and measure
the computation time in Figures 5(c) and 5(d). With
increased values of k, the running time increases overall. However,
GRD-AV-MIN takes significantly less time to terminate than
its baseline counterpart. The computation times of Baseline-LM-MIN
and Baseline-AV-MIN are similar for the same values of k,
as these baseline algorithms do not make use of the underlying
group recommendation semantics during the group formation process.
Sum aggregation results are presented in Figure 5(d), and the
behavior is consistent, as before.

7.3 User Study 

We use publicly available Flickr data to set up the user study in
AMT for New York City. Given a Flickr log of a particular city, each
row in that log corresponds to a user itinerary that is visited in a 12-hour
window. From this log, we extract the 10 most popular POIs.
The user study is designed in two phases, where Phase 1 is
used to create three different sets of users: similar, dissimilar, and
random. Phase 2 is used to assess the performance of the algorithms
on each of these sets, under different semantics and aggregation
functions.

Figure 4: We present the average running time (measured in minutes) of the group formation algorithms under LM semantics with varying # users,
# items, # groups respectively, one at a time. The default parameters are # users = 100,000, # items = 10,000, # groups = 10, k = 5, considering
Min-aggregation. The underlying dataset is Yahoo! Music.

Figure 5: We present the average running time (measured in minutes) varying top-k, using Sum and Min aggregation for LM and AV. The default
parameters are # users = 100,000, # items = 10,000, # groups = 10. The underlying dataset is Yahoo! Music.

Phase 1: Preference Collection and Group Formation: First,
we set up a HIT (Human Intelligence Task) in AMT, where we ask
each AMT user to rate each of these 10 POIs on a scale of 1-5,
with a higher rating implying greater preference. This data is collected
from 50 workers. From this collected dataset, we create user samples.
Sampling is conducted to select a seed user.

Similar user sample: We select a subset of 10 users who have
provided very similar ranking on the 10 POIs. We compute normalized
pair-wise similarity, considering each item in the top-10
ranked item lists for each user pair and aggregating that over all 10
items, as follows:

\[
sim(u, u') = \sum_{j=1}^{10} sim(u, u', j), \qquad
sim(u, u', j) =
\begin{cases}
\dfrac{|sc(u, i^j) - sc(u', i^j)|}{5} & \text{if } \ldots \\
\ldots & \text{otherwise}
\end{cases}
\]

Dissimilar user sample: We select a different subset of 10 users
who have the smallest aggregate pair-wise similarity.

Random user sample: In our third sample, we select another subset
of 10 users who are chosen randomly from the 50 users (workers).
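
The three sampling strategies can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: the per-item similarity `1 - |rating gap| / 5` is only one plausible reading of the normalized similarity, the greedy selection around a seed user is a hypothetical procedure, and the function names and toy ratings are not taken from the study.

```python
import random
from typing import Dict, List

Ratings = Dict[str, Dict[str, int]]  # worker -> POI -> rating on a 1-5 scale


def pair_similarity(r_u: Dict[str, int], r_v: Dict[str, int], scale: int = 5) -> float:
    """ASSUMED similarity: mean over shared items of 1 - |rating gap| / scale."""
    items = [i for i in r_u if i in r_v]
    return sum(1 - abs(r_u[i] - r_v[i]) / scale for i in items) / len(items)


def similar_sample(ratings: Ratings, seed: str, size: int) -> List[str]:
    """Seed user plus the (size - 1) workers most similar to the seed."""
    others = sorted(
        (u for u in ratings if u != seed),
        key=lambda u: (-pair_similarity(ratings[seed], ratings[u]), u),
    )
    return [seed] + others[: size - 1]


def dissimilar_sample(ratings: Ratings, seed: str, size: int) -> List[str]:
    """Greedily add the worker with the smallest total similarity to those chosen."""
    chosen = [seed]
    while len(chosen) < size:
        rest = [u for u in ratings if u not in chosen]
        chosen.append(min(
            rest,
            key=lambda u: (sum(pair_similarity(ratings[u], ratings[c]) for c in chosen), u),
        ))
    return chosen


def random_sample(ratings: Ratings, size: int, rng: random.Random) -> List[str]:
    """Uniform sample of workers without replacement."""
    return rng.sample(sorted(ratings), size)


# Hypothetical toy data: 5 workers rating 2 POIs (the study uses 50 workers, 10 POIs).
toy = {
    "w1": {"p1": 5, "p2": 5},
    "w2": {"p1": 5, "p2": 4},
    "w3": {"p1": 1, "p2": 1},
    "w4": {"p1": 4, "p2": 5},
    "w5": {"p1": 1, "p2": 2},
}
print(similar_sample(toy, "w1", 3))     # ['w1', 'w2', 'w4']
print(dissimilar_sample(toy, "w1", 2))  # ['w1', 'w3']
```

Ties are broken by worker name so that the output is deterministic; any other tie-breaking rule would serve equally well for this illustration.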


For brevity, we only report results on LM semantics for these
experiments and set the number of groups to $\ell = 3$. We apply
GRD-LM and Baseline-LM (both Sum and Min) to each sample,
and each algorithm produces three groups.

Phase 2: Group Satisfaction Evaluation: In this phase, we set up
6 HITs in AMT, one per user sample (similar, dissimilar, and random)
and aggregation function (3 for Min and another 3 for Sum), where each HIT comprises
the (3 + 3) groups created by GRD-LM and Baseline-LM.
In each HIT, we first show the individual user preference ratings
for all 10 users in the sample, over all 10 items. We do not disclose
the underlying group formation algorithms (but refer to them
as Method-1 and Method-2) and present the groups formed by
GRD-LM and Baseline-LM. We note that our settings mimic
the set-up of previous user studies in group recommendation research.
We also request the worker to regard herself
as one of the individuals in the sample and ask her to rate the following
(higher is better): (1) her satisfaction with the
groups formed by Method-1; (2) her satisfaction with the
groups formed by Method-2; (3) in an absolute sense, which method she
prefers more. Each HIT is undertaken by 10 unique users, thereby
involving 60 new users in this phase (30 for Min and another 30
for Sum). For each HIT, we average the ratings and present them in
Figures 7(b) and 7(c). Standard error bars are added to indicate statistical
significance.

Additionally, we aggregate and compute the percentage of users
who prefer GRD-LM versus Baseline-LM in Figure 7(a).
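
The per-HIT summary statistics described above can be computed as in the following sketch. The response values are hypothetical placeholders, not data from the study, and the helper names are chosen only for illustration; the standard error is taken as the sample standard deviation divided by the square root of the number of responses.

```python
from math import sqrt
from statistics import mean, stdev
from typing import List, Tuple


def mean_and_sem(ratings: List[float]) -> Tuple[float, float]:
    """Average rating and its standard error (sample std / sqrt(n))."""
    return mean(ratings), stdev(ratings) / sqrt(len(ratings))


def preference_share(votes: List[str], method: str) -> float:
    """Percentage of workers whose stated preference is `method`."""
    return 100.0 * sum(v == method for v in votes) / len(votes)


# Hypothetical responses from the 10 workers of one HIT (not data from the study).
method1 = [4, 5, 3, 4, 4, 5, 3, 4, 5, 3]
votes = ["Method-1"] * 7 + ["Method-2"] * 3

avg, sem = mean_and_sem(method1)
print(f"{avg:.2f} +/- {sem:.2f}")           # 4.00 +/- 0.26
print(preference_share(votes, "Method-1"))  # 70.0
```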

Interpretation of Results: We make the following key observations
from the results presented in Figures 7(a), 7(b), and 7(c). First
and foremost, our proposed algorithm GRD-LM gives rise to higher
satisfaction compared to the baseline, in all cases. In fact, this difference
in satisfaction seems to be higher when the user population
is dissimilar in its individual preferences. During our post-analysis,
we see that the clustering-based baseline algorithm indeed becomes
ineffective when the individual user preferences are dissimilar from
each other. For the same reason, the difference in the average satisfaction
of our greedy algorithm from the baseline algorithm is the
highest for dissimilar users and smallest for similar users. A random
user population consists of both similar and dissimilar users,
hence the effectiveness falls in the middle. These results
demonstrate that our proposed solutions can effectively exploit existing
group recommendation semantics and form groups that lead
to high group satisfaction in practice.

8. RELATED WORK

While no prior work has addressed the problem of group formation
in the context of recommender systems, we still discuss existing
work that appears to be contextually most related.








































Figure 6: We present the average running time (measured in minutes) of the group formation algorithms under AV semantics with varying # users,
# items, # groups respectively, one at a time. The default parameters are # users = 100,000, # items = 10,000, # groups = 10, k = 5, considering
Min-aggregation. The underlying dataset is Yahoo! Music.

Figure 7: User Study Results: Figures 7(b) and 7(c) present the average user satisfaction score of GRD-LM and that of Baseline-LM for both Min
and Sum aggregation with statistical significance analyses. Figure 7(a) indicates that about 80% of the users prefer GRD over the Baseline algorithms.


Group Recommendation: Group recommendation has been designed
for various domains such as news pages, tourism,
music, books, restaurants, and TV programs.

There are two dominant strategies for group recommendations.
The first approach creates a pseudo-user representing the
group and then makes recommendations to that pseudo-user, while
the second strategy computes a recommendation list for each group
member and then combines them to produce a group’s list. For
the latter, a widely adopted approach is to apply an aggregation
function to obtain a consensus group preference for a candidate
item. Aggregation functions such as least misery and aggregate
voting are popular in existing works. [22]
pre-clusters the users, and the individual recommendations are generated
for group members using each member’s cluster. After that,
a group aggregation function is applied.
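
As a minimal sketch of the two aggregation functions mentioned above, the following Python snippet scores each candidate item for a group as the minimum member rating (least misery) or the sum of member ratings (aggregate voting), and then takes the top-k items. The function names and the toy ratings are hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Tuple

Ratings = Dict[str, Dict[str, float]]  # user -> item -> predicted rating


def group_scores(ratings: Ratings, group: List[str], semantics: str) -> Dict[str, float]:
    """Per-item group score: min over members (LM) or sum over members (AV)."""
    if semantics not in ("LM", "AV"):
        raise ValueError("semantics must be 'LM' or 'AV'")
    combine = min if semantics == "LM" else sum
    items = ratings[group[0]]
    return {i: combine(ratings[u][i] for u in group) for i in items}


def top_k(scores: Dict[str, float], k: int) -> List[Tuple[str, float]]:
    """The k highest-scoring items, ties broken by item name."""
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:k]


# Hypothetical toy ratings: two users, two items.
toy = {"a": {"x": 5, "y": 2}, "b": {"x": 1, "y": 3}}
print(top_k(group_scores(toy, ["a", "b"], "LM"), 1))  # [('y', 2)]
print(top_k(group_scores(toy, ["a", "b"], "AV"), 1))  # [('x', 6)]
```

On this toy input the two semantics recommend different items to the same group, which is why a grouping strategy that ignores the semantics can produce groups that are poorly served by their recommendations.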

In these works, groups are created beforehand, either by a random
set of users with different interests, or by a number of users
who explicitly choose to be part of a group.

Market-based Strategies: Existing research on market-based
strategies studies problems on daily deals sites, such as
Groupon and LivingSocial, that are orthogonal to our group formation
problem. These works focus on recommending deals (i.e.,
items) to users and groups. For example, one work proposes new
algorithms for daily deals recommendation based on the explore-then-exploit
strategy. Another recommends the best deals to the users,
among a set of candidate deals, to maximize revenues. A real-time
bidding strategy for group-buying deals based on the online optimization
of bid values has also been studied. This body of work essentially
relies on price discounts to incentivize group formation
around deals, with the objective of maximizing revenue. By contrast,
we investigate how to form groups to maximize user satisfaction
under existing group recommendation semantics. Our work
is directly deployable as a non-intrusive addition to group recommender
systems where explicit incentives may not be present.

Team Formation: Team formation problems
are often modeled using Integer Programming, or heuristic solutions
using Simulated Annealing or Genetic Algorithms
are designed. These problems are assignment problems.

In general, group formation is not a matching or generalized assignment
problem. There are no resources to match the
users to. We need to “match” users to one another. In that sense,
it’s closer in spirit to clustering. However, as we demonstrate
in the paper, a clustering algorithm which is agnostic to the group
recommendation semantics (LM or AV) is likely to perform poorly
for purposes of maximizing group satisfaction.

Community Detection: These problems discover communities
(sets of users) with common interests. Again, our groups
have a more explicit connotation (in the sense of having clearly defined
satisfaction scores) than communities. One can potentially
generate a graph of users based on a suitable notion of distance or
similarity between users in terms of tastes and find communities.
Once again, this approach falls short, as it does not take group recommendation
semantics into account.

Multi-way Partition: Minimization problems over multiway
partition functions have been studied with graphs or hypergraphs
as the underlying abstract model. They attempt to partition
the nodes to optimize a certain outcome (e.g., variants of k-cut
problems). While these problems are NP-hard, efficient algorithms
with provable approximation factors are known when the objective
function exhibits certain properties. If we were to use such a
weighted graph, the weight on each edge would be local to just two users
and would not capture the essence of group recommendation semantics,
which renders those solutions far from ideal for our problem.

9. CONCLUSION 

We initiate the study of forming groups in the context of group
recommender systems. We consider two popular group recommendation
semantics (LM and AV) and formalize the problem of creating
a set of non-overlapping groups over an underlying user population,
such that the aggregate satisfaction of the formed groups,
with their recommended top-k lists, is maximized. We prove that
optimal group formation is computationally intractable under both
group recommendation semantics. We present efficient greedy
group formation algorithms and show that they achieve absolute
error guarantees for LM. We present a comprehensive experimental
analysis and user studies that demonstrate the effectiveness as
well as the scalability of our proposed solutions.

The approximability of group formation under AV semantics remains
open, although we conjecture that it may be hard to approximate.
Identifying natural special cases that are tractable is an interesting
open problem. Forming groups where the individual members
are not treated equally, or groups that possibly overlap,
is also worthy of study.


APPENDIX 

A. OPTIMAL ALGORITHMS 

Despite the fact that the optimal group formation problem is
computationally intractable, we describe optimal algorithms under
both LM and AV semantics by formulating them as integer programming
(IP) problems. We can make use of existing integer programming
solvers (such as CPLEX) to solve these problems. Since IP is
also NP-hard and could take exponential time in the worst case, the
proposed solutions are not scalable. Nevertheless, the formulation
is useful when the numbers of users and items are fairly small.

For both formulations the following Boolean decision variables
are defined: $u_{ig}$ captures whether user $u_i$ is part of group $g$. To
describe the top-k itemset of each group, an additional Boolean
decision variable $y_{jg}$ indicates whether item $j$ is the k-th item for group $g$,
whereas $w_{jg}$ indicates whether item $j$ is one of the top-(k - 1)
items for group $g$. The following formulation is provided assuming
Min-aggregation for a general value of k > 1. For the Max-aggregation
formulation, we no longer need $w_{jg}$, and one has to
check whether $y_{jg}$ is indeed the top-1 item of the recommended itemset.
Similarly, Sum aggregation could be performed by modifying the
objective function to aggregate over all k items.

A.1 IP Formulation for Least Misery

The objective function for LM aims to form $\ell$ groups such that
their sum of scores is maximized. The first two constraints capture
the fact that the score of an item $j$ is to be computed by considering
the minimum score of that item over all users. The third and fourth
constraints are used to capture the top-(k - 1) items. The rest of the
constraints simply state that the k-th item is a single item for every
group, whereas there should be a total of k - 1 additional items
whose score is higher than that of the k-th item. Finally, we assert
that only $\ell$ groups are to be formed, whereas a user can belong to
only one of these groups (satisfying disjointness).

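\[
\text{Maximize} \quad \sum_{g=1}^{\ell} \sum_{j=1}^{m} \{ y_{jg} \times sc(g, j) \} \tag{1}
\]

\[
\begin{aligned}
\text{s.t.} \quad & sc(g, j) = r', \quad r' \le \{\forall u_{ig} = 1\}\; u_{ig} \times sc(u_i, j) \\
& w_{ig} \times sc(g, i) \ge y_{jg} \times sc(g, j) \times w_{ig} \\
& (1 - w_{jg}) \times sc(g, j) \le y_{jg} \times sc(g, j) \\
& \sum_{j=1}^{m} y_{jg} = 1 \\
& \sum_{j=1}^{m} w_{jg} = k - 1 \\
& \ell \le n \\
& y_{jg} = 1/0, \quad \forall j = 1, 2, \ldots, m \\
& w_{jg} = 1/0, \quad \forall j = 1, 2, \ldots, m \\
& \sum_{g=1}^{\ell} u_{ig} = 1, \quad \forall i = 1, 2, \ldots, n \\
& u_{ig} = 1/0, \quad \forall i = 1, 2, \ldots, n, \; \forall g = 1, 2, \ldots, \ell
\end{aligned}
\]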

When this IP is run on the earlier example with k = 1, the following
groups are produced: $\{u_1, u_3, u_4\}$, $\{u_2, u_6\}$, $\{u_5\}$, with an
overall objective value of 4 + 5 + 3 = 12.
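
For very small instances, the optimum that the IP defines can also be found by exhaustive enumeration, which is a convenient way to sanity-check a solver's output. The sketch below is not the IP itself and does not use an IP solver: it simply tries every assignment of users to at most `max_groups` groups and scores each group by its k-th best least-misery item score (Min-aggregation). The function names and the toy ratings are hypothetical.

```python
from itertools import product
from typing import Dict, List, Tuple

Ratings = Dict[str, Dict[str, int]]  # user -> item -> rating


def lm_value(ratings: Ratings, group: List[str], k: int) -> int:
    """Min-aggregation under LM: the k-th best least-misery item score."""
    items = ratings[group[0]]
    scores = sorted((min(ratings[u][i] for u in group) for i in items), reverse=True)
    return scores[k - 1]


def best_grouping(ratings: Ratings, max_groups: int, k: int) -> Tuple[int, List[List[str]]]:
    """Exhaustive search over all assignments; feasible only for tiny inputs."""
    users = sorted(ratings)
    best_value, best_groups = float("-inf"), []
    for labels in product(range(max_groups), repeat=len(users)):
        groups = [[u for u, g in zip(users, labels) if g == gid]
                  for gid in range(max_groups)]
        groups = [g for g in groups if g]  # "at most" max_groups: drop empty ones
        value = sum(lm_value(ratings, g, k) for g in groups)
        if value > best_value:
            best_value, best_groups = value, groups
    return best_value, best_groups


# Hypothetical toy ratings: four users, two items.
toy = {
    "a": {"x": 5, "y": 1},
    "b": {"x": 5, "y": 2},
    "c": {"x": 1, "y": 4},
    "d": {"x": 2, "y": 4},
}
print(best_grouping(toy, max_groups=2, k=1))  # (9, [['a', 'b'], ['c', 'd']])
```

The search visits `max_groups ** n` assignments, so it is only practical for a handful of users, which is consistent with the remark that exact formulations are useful only when the instance is small.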

A.2 IP Formulation for Aggregate Voting 

The formulation of optimal group formation under aggregate voting
is similar to that of LM, except for the fact that the score of an
item $j$ for a group $g$ is the summation of the scores of $j$ over all users
in $g$.


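\[
\text{Maximize} \quad \sum_{g=1}^{\ell} \sum_{j=1}^{m} \{ y_{jg} \times sc(g, j) \} \tag{2}
\]

\[
\begin{aligned}
\text{s.t.} \quad & sc(g, j) = \sum_{\{\forall u_{ig} = 1\}} u_{ig} \times sc(u_i, j) \\
& w_{ig} \times sc(g, i) \ge y_{jg} \times sc(g, j) \times w_{ig} \\
& (1 - w_{jg}) \times sc(g, j) \le y_{jg} \times sc(g, j) \\
& \sum_{j=1}^{m} y_{jg} = 1 \\
& \sum_{j=1}^{m} w_{jg} = k - 1 \\
& \ell \le n \\
& y_{jg} = 1/0, \quad \forall j = 1, 2, \ldots, m \\
& w_{jg} = 1/0, \quad \forall j = 1, 2, \ldots, m \\
& \sum_{g=1}^{\ell} u_{ig} = 1, \quad \forall i = 1, 2, \ldots, n \\
& u_{ig} = 1/0, \quad \forall i = 1, 2, \ldots, n, \; \forall g = 1, 2, \ldots, \ell
\end{aligned}
\]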

When run on the earlier example, the optimal grouping with two groups
consists of the following groups: $\{u_1, u_3, u_4\}$ and $\{u_2, u_5, u_6\}$,
with an overall objective function value of 14.

B. SUBOPTIMAL GROUP FORMATION BY 
GRD-LM-SUM 

Example 5. Consider the user set
$\mathcal{U} = \{u_1, u_2, u_3, u_4, u_5, u_6\}$ and the itemset $\mathcal{I} = \{i_1, i_2, i_3\}$.
The users’ preferences for the itemset are given (or predicted) as
in Table 5. Suppose the users need to be partitioned into at most 3
groups ($\ell \le 3$), where each group has to be recommended k = 2
items. □

Using GRD-LM-SUM, the following 3 groups are formed at
the end: $\{u_2\}$, $\{u_3, u_4\}$, $\{u_1, u_5, u_6\}$, with an overall objective
function value of (5 + 3) + (5 + 2) + (3 + 2) = 20, whereas
the optimal grouping forms 3 different groups,
$\{u_2, u_6\}$, $\{u_3, u_4\}$, $\{u_1, u_5\}$, with an overall objective function
value of (5 + 2) + (5 + 2) + (4 + 3) = 21.

User-item Ratings   u1   u2   u3   u4   u5   u6
i1                  1    2    2    2    2    1
i2                  4    3    5    5    4    2
i3                  3    5    1    1    3    5

Table 5: User Item Preference Rating for Example 5
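
The two objective values quoted for Example 5 can be re-derived with a short Python sketch. The ratings are transcribed from Table 5; the rating of u6 for i1 is assumed to be 1 (the totals below are the same if it is 2). The helper names are hypothetical. Each group is scored under LM with Sum aggregation, that is, the sum of its k = 2 largest least-misery item scores.

```python
from typing import Dict, List, Tuple

# user -> ratings for (i1, i2, i3), transcribed from Table 5;
# the i1 rating of u6 is an assumed value (see the note above).
RATINGS: Dict[str, Tuple[int, int, int]] = {
    "u1": (1, 4, 3), "u2": (2, 3, 5), "u3": (2, 5, 1),
    "u4": (2, 5, 1), "u5": (2, 4, 3), "u6": (1, 2, 5),
}


def lm_sum_value(group: List[str], k: int = 2) -> int:
    """Sum of the k largest least-misery (minimum over members) item scores."""
    per_item = [min(RATINGS[u][j] for u in group) for j in range(3)]
    return sum(sorted(per_item, reverse=True)[:k])


def objective(groups: List[List[str]], k: int = 2) -> int:
    """Total satisfaction of a grouping: sum of the group values."""
    return sum(lm_sum_value(g, k) for g in groups)


greedy = [["u2"], ["u3", "u4"], ["u1", "u5", "u6"]]
optimal = [["u2", "u6"], ["u3", "u4"], ["u1", "u5"]]
print(objective(greedy), objective(optimal))  # 20 21
```

The gap of 1 between the two totals comes entirely from where u6 is placed: alongside u1 and u5 it drags the group's least-misery scores down to (1, 2, 3), whereas alongside u2 both users keep their shared top item at score 5.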